Architecture & Components

The architecture system represents a phenogenetic architecture as a directed acyclic graph (DAG) of ArchNode objects. Each node wraps an ArchComponent that computes one piece of the phenotype model.

Architecture

class xftsim.arch.Architecture(formula=None, effects=None)[source]

Bases: object

Phenogenetic architecture: a DAG of ArchNodes executed each generation.

Can be constructed programmatically via add() or from a formula string (parsed by the parser module). Nodes are topologically sorted so that dependencies are resolved before dependents.

Parameters:
  • formula (str, optional) – Multi-line formula string (parsed into ArchNodes). See parser.parse_formula for the grammar.

  • effects (dict, optional) – Name → EffectSpec mapping for resolving effect references in the formula.

Examples

Programmatic construction:

>>> from xftsim.arch import Architecture, GeneticComponent, NoiseComponent, AggregationComponent
>>> from xftsim.effect import AdditiveEffects
>>> eff = AdditiveEffects.from_h2(h2=0.5, m=100, seed=1)
>>> arch = Architecture()
>>> arch.add('Y.G', GeneticComponent(eff))
>>> arch.add('Y.E', NoiseComponent(0.5))
>>> arch.add('Y', AggregationComponent('Y.G + Y.E'))

Formula construction:

>>> arch = Architecture(
...     formula="""
...     Y.G ~ genetic(eff)
...     Y.E ~ noise(0.5)
...     Y ~ Y.G + Y.E
...     """,
...     effects={'eff': eff},
... )

classmethod from_formula(formula, effects=None)[source]

Construct an Architecture from a DSL formula string.

Parameters:
  • formula (str) – Multi-line formula string (see parser module for grammar).

  • effects (dict, optional) – Name → EffectSpec mapping for resolving effect references.

Return type:

Architecture

Returns:

Architecture

add(outputs, component, inputs=None, grouping=None)[source]

Programmatically add a node to the architecture.

Parameters:
  • outputs (str or list[str]) – Output name(s).

  • component (ArchComponent) – The component to execute.

  • inputs (list[str], optional) – Input names (for aggregation). Auto-detected for AggregationComponent.

  • grouping (str, optional) – Grouping variable.

Return type:

None

property nodes: list[ArchNode]

Return the topologically sorted node list.
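The ordering guarantee can be illustrated with a small standalone sketch. It sorts the three node names from the Examples above using the standard library's graphlib. The dependency dictionary and the choice of graphlib are assumptions made for illustration; how xftsim performs the sort internally is not documented here.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each output name -> the names it reads.
deps = {
    "Y.G": set(),          # genetic node, reads no phenotypes
    "Y.E": set(),          # noise node, reads no phenotypes
    "Y": {"Y.G", "Y.E"},   # aggregation node reads both
}

# static_order() yields every name after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

The relative order of Y.G and Y.E is not fixed by the graph; only their position before Y is.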

compute(haplotypes, phenotypes=None, rng=None, **kwargs)[source]

Execute all nodes in topological order.

Parameters:
  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray, optional) – Existing phenotype array to write into. Created if None.

  • rng (np.random.RandomState, optional) – Random state for noise components.

  • **kwargs – Additional context (phenotype_history, pedigree_history, generation).

Return type:

PhenotypeArray

Returns:

PhenotypeArray – The phenotype array with all computed values.
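The execution model can be pictured as a loop over already-sorted nodes that write into one shared container. The sketch below is self-contained: a plain dictionary stands in for PhenotypeArray and simple callables stand in for ArchComponent instances. The names run_nodes and nodes are hypothetical and do not belong to xftsim.

```python
import random

def run_nodes(sorted_nodes, n, rng):
    """Run (output_name, component) pairs in order, sharing one result dict."""
    phenotypes = {}  # stand-in for PhenotypeArray
    for name, component in sorted_nodes:
        # Each component can read everything computed upstream of it.
        phenotypes[name] = component(phenotypes, n, rng)
    return phenotypes

nodes = [
    # Placeholder genetic values (a real node would use haplotypes).
    ("Y.G", lambda ph, n, rng: [1.0] * n),
    # Noise with variance 0.5, so the standard deviation is sqrt(0.5).
    ("Y.E", lambda ph, n, rng: [rng.gauss(0.0, 0.5 ** 0.5) for _ in range(n)]),
    # Aggregation reads the two upstream outputs.
    ("Y", lambda ph, n, rng: [g + e for g, e in zip(ph["Y.G"], ph["Y.E"])]),
]

result = run_nodes(nodes, n=4, rng=random.Random(1))
```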

ArchNode

class xftsim.arch.ArchNode(outputs, component, inputs=<factory>, grouping=None)[source]

Bases: object

A single node in the architecture DAG.

Parameters:
  • outputs (list[str]) – Names written to PhenotypeArray.

  • component (ArchComponent) – The computation to perform.

  • inputs (list[str]) – Names read from PhenotypeArray (for aggregation) or [] (for generative).

  • grouping (str or None) – Grouping variable for | operator, or None (implicit | IID).

outputs: list[str]
component: ArchComponent
inputs: list[str]
grouping: str | None = None
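The <factory> default in the signature suggests a dataclass field created with default_factory, which gives each node its own list instead of one shared default. That reading is an inference from the rendered signature. NodeSketch below is a hypothetical stand-in with the same four fields, not the library class.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class NodeSketch:
    """Hypothetical stand-in with the same four fields as ArchNode."""
    outputs: list[str]
    component: Any
    inputs: list[str] = field(default_factory=list)  # fresh list per instance
    grouping: Optional[str] = None

a = NodeSketch(outputs=["Y.E"], component="noise")
b = NodeSketch(outputs=["Y"], component="aggregation", inputs=["Y.G", "Y.E"])

# Mutating one node's inputs does not leak into other nodes.
a.inputs.append("X")
c = NodeSketch(outputs=["Z"], component="noise")
```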

Component ABC

class xftsim.arch.ArchComponent[source]

Bases: ABC

Abstract base class for architecture components (DSL built-in functions).

name

Component name (e.g. ‘genetic’, ‘noise’).

Type:

str

kind

One of ‘genetic’, ‘generative’, ‘aggregating’.

Type:

str

accepts_grouping

Whether this component can use the | operator.

Type:

bool

name: str = ''
kind: str = ''
accepts_grouping: bool = False
abstract compute(node, haplotypes, phenotypes, **kwargs)[source]

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.
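A subclass supplies the three class attributes and implements compute(). The sketch below mirrors that contract with a hypothetical stand-in base class (ComponentSketch) so that it runs without xftsim installed. ConstantComponent is an invented example, not a built-in, and it returns a plain list where a real component would return an ndarray.

```python
from abc import ABC, abstractmethod

class ComponentSketch(ABC):
    """Hypothetical stand-in for ArchComponent: same attributes, same method."""
    name: str = ""
    kind: str = ""
    accepts_grouping: bool = False

    @abstractmethod
    def compute(self, node, haplotypes, phenotypes, **kwargs):
        ...

class ConstantComponent(ComponentSketch):
    """Illustrative generative component: one constant per individual."""
    name = "constant"
    kind = "generative"

    def __init__(self, value, n):
        self.value = value
        self.n = n

    def compute(self, node, haplotypes, phenotypes, **kwargs):
        return [self.value] * self.n

out = ConstantComponent(2.0, n=3).compute(node=None, haplotypes=None, phenotypes={})
```

Instantiating the base class directly raises TypeError because compute() is abstract.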

Genetic Components

class xftsim.arch.GeneticComponent(effects)[source]

Bases: ArchComponent

Univariate genetic component: computes diploid G @ effects.

Uses standardized_matvec when effects.standardized is True, otherwise plain matvec.

Parameters:

effects (EffectSpec) – Effect sizes (shape (m,) for univariate). Typically an AdditiveEffects or SparseEffects instance.
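A small worked example may help fix the arithmetic. The sketch assumes that the diploid genotype G is the per-variant sum of the two haplotype copies, and that the standardized variant centers each column of G and scales it to unit variance before multiplying. Both are assumptions about the usual convention, not statements about the internals of standardized_matvec. The helper names are hypothetical.

```python
from statistics import mean, pstdev

# Toy haplotypes indexed [individual][variant][copy]: n=3, m=2, 0/1 alleles.
hap = [
    [[0, 1], [1, 1]],
    [[1, 1], [0, 0]],
    [[0, 0], [1, 0]],
]
effects = [0.5, -1.0]

# Diploid genotype matrix G: allele count per individual and variant.
G = [[sum(copies) for copies in indiv] for indiv in hap]

def matvec(M, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(x * b for x, b in zip(row, v)) for row in M]

def standardize_columns(M):
    """Center each column and scale it to unit (population) variance."""
    cols = list(zip(*M))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - mu) / sd for x, mu, sd in zip(row, mus, sds)] for row in M]

raw = matvec(G, effects)                       # unstandardized scores
std = matvec(standardize_columns(G), effects)  # standardized scores
```

With standardized columns the scores sum to zero across individuals, which is one practical difference between the two modes.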

name: str = 'genetic'
kind: str = 'genetic'
accepts_grouping: bool = False
compute(node, haplotypes, phenotypes, **kwargs)[source]

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.MVGeneticComponent(effects)[source]

Bases: GeneticComponent

Multivariate genetic component: computes G @ effects for k traits.

Inherits compute() from GeneticComponent since numpy’s matvec handles both 1D and 2D effect arrays.

Parameters:

effects (EffectSpec) – Effect sizes with shape (m, k). Typically a MultivariateEffects instance.

name: str = 'mvGenetic'

class xftsim.arch.HaplotypeGeneticComponent(effects, haplotype='maternal')[source]

Bases: ArchComponent

Haplotype-specific genetic component.

Computes hap[:,:,0] @ effects (maternal) or hap[:,:,1] @ effects (paternal). Enables indirect genetic effects (IGE) formulas where maternal and paternal contributions are modeled separately.

Parameters:
  • effects (EffectSpec) – Effect sizes (shape (m,)).

  • haplotype (str) – Which haplotype copy to use: 'maternal' or 'paternal'.

Raises:

ValueError – If haplotype is not 'maternal' or 'paternal'.
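The indexing described above can be sketched with nested lists laid out as [individual][variant][copy], where copy 0 is taken as maternal and copy 1 as paternal, following the hap[:,:,0] and hap[:,:,1] notation. The function haplotype_score is a hypothetical illustration, not the library implementation.

```python
def haplotype_score(hap, effects, haplotype="maternal"):
    """Score one haplotype copy per individual: hap[:, :, copy] @ effects."""
    if haplotype not in ("maternal", "paternal"):
        raise ValueError("haplotype must be 'maternal' or 'paternal'")
    copy = 0 if haplotype == "maternal" else 1
    return [
        sum(variant[copy] * b for variant, b in zip(indiv, effects))
        for indiv in hap
    ]

# Toy data: n=3 individuals, m=2 variants, 2 copies each.
hap = [
    [[0, 1], [1, 1]],
    [[1, 1], [0, 0]],
    [[0, 0], [1, 0]],
]
effects = [0.5, -1.0]

maternal = haplotype_score(hap, effects, "maternal")
paternal = haplotype_score(hap, effects, "paternal")
# The two haplotype scores add up to the diploid score G @ effects.
diploid = [m + p for m, p in zip(maternal, paternal)]
```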

name: str = 'haplotypeGenetic'
kind: str = 'genetic'
accepts_grouping: bool = False
compute(node, haplotypes, phenotypes, **kwargs)[source]

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

Noise Components

class xftsim.arch.NoiseComponent(variance)[source]

Bases: ArchComponent

Univariate noise component.

Draws iid N(0, variance) per individual. When grouping is active (e.g., | FID), draws one shared value per group and broadcasts to all members.

Parameters:

variance (float) – Noise variance (used as the variance of the normal distribution).
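The difference between ungrouped and grouped draws can be sketched as follows. The function grouped_noise is a hypothetical helper that draws one value per distinct group label and repeats it for every member. It uses the standard library's random module, whereas the library itself takes a NumPy random state.

```python
import random

def grouped_noise(group_ids, variance, rng):
    """Draw one N(0, variance) value per group and broadcast it to members."""
    sd = variance ** 0.5  # gauss() expects a standard deviation
    draws = {}
    out = []
    for g in group_ids:
        if g not in draws:
            draws[g] = rng.gauss(0.0, sd)
        out.append(draws[g])
    return out

rng = random.Random(0)
fid = ["f1", "f1", "f2", "f3", "f3"]

shared = grouped_noise(fid, 0.5, rng)                   # like noise(0.5) | FID
iid = [rng.gauss(0.0, 0.5 ** 0.5) for _ in fid]         # one draw per individual
```

Members of the same family share a value in shared, while every entry of iid is an independent draw.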

name: str = 'noise'
kind: str = 'generative'
accepts_grouping: bool = True
compute(node, haplotypes, phenotypes, **kwargs)[source]

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.CNoiseComponent(cov)[source]

Bases: ArchComponent

Correlated multivariate noise component.

Draws N(0, cov) per individual. When grouping is active, draws one shared vector per group and broadcasts. Returns an (n, k) array.

Parameters:

cov (np.ndarray) – (k, k) covariance matrix. Must be square.

Raises:

ValueError – If cov is not a square matrix.
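One common way to draw vectors with a given covariance is to multiply independent standard normal draws by a Cholesky factor of cov. The sketch below shows that technique in plain Python. Whether CNoiseComponent uses a Cholesky factor internally is not stated in this reference, and the helper names cholesky and draw_mvn are hypothetical.

```python
import math
import random

def cholesky(cov):
    """Lower-triangular L with L @ L.T == cov (cov must be square, pos. def.)."""
    k = len(cov)
    if any(len(row) != k for row in cov):
        raise ValueError("cov must be a square matrix")
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = sum(L[i][p] * L[j][p] for p in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def draw_mvn(cov, n, rng):
    """Return n rows, each a draw from N(0, cov), as an n-by-k nested list."""
    L = cholesky(cov)
    k = len(cov)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]  # independent N(0, 1)
        rows.append([sum(L[i][j] * z[j] for j in range(i + 1)) for i in range(k)])
    return rows

cov = [[1.0, 0.5], [0.5, 2.0]]
L = cholesky(cov)
noise = draw_mvn(cov, n=5, rng=random.Random(0))
```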

name: str = 'cnoise'
kind: str = 'generative'
accepts_grouping: bool = True
property k: int

Number of correlated traits.

compute(node, haplotypes, phenotypes, **kwargs)[source]

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

Aggregation

class xftsim.arch.AggregationComponent(expression)[source]

Bases: ArchComponent

Aggregation component: evaluates arithmetic expressions over phenotype values.

Uses a custom tokenizer + shunting-yard evaluator (no eval()). Supports +, -, *, /, scalar multiplication, parentheses, and dotted names (e.g., 'Y.G + Y.E').

Parameters:

expression (str) – Arithmetic expression referencing phenotype names, e.g. 'height.G + height.E' or '0.5 * (Y.G + Y.E)'.
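For readers unfamiliar with the approach, the following is a minimal sketch of a tokenizer plus shunting-yard evaluator for expressions of this shape. It is not the library's evaluator: the function names are hypothetical, unary minus is not handled, and unbalanced parentheses are not reported gracefully.

```python
import operator
import re

TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z_][\w.]*)|([-+*/()]))")
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def tokenize(expr):
    """Split into ('num', float), ('name', str) and ('op', str) tokens."""
    expr, tokens, pos = expr.strip(), [], 0
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if not m:
            raise ValueError(f"unexpected character {expr[pos]!r} at {pos}")
        num, name, op = m.groups()
        if num is not None:
            tokens.append(("num", float(num)))
        elif name is not None:
            tokens.append(("name", name))   # dotted names such as Y.G
        else:
            tokens.append(("op", op))
        pos = m.end()
    return tokens

def to_rpn(tokens):
    """Shunting-yard: convert infix tokens to reverse Polish notation."""
    out, stack = [], []
    for kind, val in tokens:
        if kind in ("num", "name"):
            out.append((kind, val))
        elif val == "(":
            stack.append(val)
        elif val == ")":
            while stack[-1] != "(":
                out.append(("op", stack.pop()))
            stack.pop()  # discard the matching "("
        else:
            # Pop operators of equal or higher precedence (left-associative).
            while stack and stack[-1] != "(" and PREC[stack[-1]] >= PREC[val]:
                out.append(("op", stack.pop()))
            stack.append(val)
    while stack:
        out.append(("op", stack.pop()))
    return out

def evaluate(expr, values):
    """Evaluate expr per individual; values maps name -> list of floats."""
    rpn = to_rpn(tokenize(expr))
    n = len(next(iter(values.values())))
    result = []
    for i in range(n):
        stack = []
        for kind, val in rpn:
            if kind == "num":
                stack.append(val)
            elif kind == "name":
                stack.append(values[val][i])
            else:
                b, a = stack.pop(), stack.pop()
                stack.append(OPS[val](a, b))
        result.append(stack[0])
    return result

vals = {"Y.G": [1.0, 2.0], "Y.E": [0.5, -1.0]}
total = evaluate("0.5 * (Y.G + Y.E)", vals)
```

Because names are looked up in a dictionary and operators in a fixed table, no arbitrary Python code is ever executed, which is the main reason to prefer this design over eval().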

name: str = 'aggregation'
kind: str = 'aggregating'
accepts_grouping: bool = False
compute(node, haplotypes, phenotypes, **kwargs)[source]

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

Parental Components

class xftsim.arch.ParentComponent(phenotype_name, founder_component=None, normalize=False)[source]

Bases: _ParentalComponent

Vertical transmission: midparent (average of mother and father).

Parameters:
  • phenotype_name (str) – Phenotype to look up in previous generation.

  • founder_component (ArchComponent, optional) – Fallback at generation 0.

  • normalize (bool) – Standardize parental values before lookup. Default False.

name: str = 'parent'

class xftsim.arch.MotherComponent(phenotype_name, founder_component=None, normalize=False)[source]

Bases: _ParentalComponent

Vertical transmission: mother’s phenotype from previous generation.

Parameters:
  • phenotype_name (str) – Phenotype to look up in previous generation.

  • founder_component (ArchComponent, optional) – Fallback at generation 0.

  • normalize (bool) – Standardize parental values before lookup. Default False.

name: str = 'mother'

class xftsim.arch.FatherComponent(phenotype_name, founder_component=None, normalize=False)[source]

Bases: _ParentalComponent

Vertical transmission: father’s phenotype from previous generation.

Parameters:
  • phenotype_name (str) – Phenotype to look up in previous generation.

  • founder_component (ArchComponent, optional) – Fallback at generation 0.

  • normalize (bool) – Standardize parental values before lookup. Default False.

name: str = 'father'
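The three parental lookups differ only in which parent's value is read. The sketch below illustrates them with index lists that point into the previous generation. The function parental_values, its mode and founder arguments, and the choice of z-scoring for normalize are all assumptions made for illustration. The library's exact normalization and founder handling are not described in this reference.

```python
from statistics import mean, pstdev

def parental_values(prev, mother_idx, father_idx, mode="parent",
                    normalize=False, founder=None):
    """Look up parental phenotypes from the previous generation."""
    if prev is None:
        # Generation 0: no parents exist, so fall back to a founder rule.
        return founder(len(mother_idx))
    if normalize:
        # Assumed meaning of normalize: z-score the previous generation.
        mu, sd = mean(prev), pstdev(prev)
        prev = [(x - mu) / sd for x in prev]
    if mode == "mother":
        return [prev[m] for m in mother_idx]
    if mode == "father":
        return [prev[f] for f in father_idx]
    # Midparent: average of mother and father.
    return [(prev[m] + prev[f]) / 2 for m, f in zip(mother_idx, father_idx)]

prev_Y = [1.0, 3.0, 2.0, 6.0]   # previous generation's phenotype
mothers = [0, 0, 2]             # row index of each child's mother
fathers = [1, 1, 3]             # row index of each child's father

midparent = parental_values(prev_Y, mothers, fathers)
maternal = parental_values(prev_Y, mothers, fathers, mode="mother")
founders = parental_values(None, mothers, fathers, founder=lambda n: [0.0] * n)
```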

Sibling Components

class xftsim.arch.SiblingMeanComponent(source_name)[source]

Bases: _SiblingComponent

Sibling mean: average of source phenotype within group.

Parameters:

source_name (str)

name: str = 'sibling_mean'

class xftsim.arch.SiblingSumComponent(source_name)[source]

Bases: _SiblingComponent

Sibling sum: sum of source phenotype within group.

Parameters:

source_name (str)

name: str = 'sibling_sum'

class xftsim.arch.SiblingAnyComponent(source_name)[source]

Bases: _SiblingComponent

Sibling any: 1.0 if any member in group has value > 0, else 0.0.

Parameters:

source_name (str)

name: str = 'sibling_any'

class xftsim.arch.SiblingCountComponent(source_name)[source]

Bases: _SiblingComponent

Sibling count: number of individuals in each group.

Parameters:

source_name (str)

name: str = 'sibling_count'

class xftsim.arch.SiblingEldestComponent(source_name)[source]

Bases: _SiblingComponent

Sibling eldest: value of the first (lowest IID) member in each group.

Parameters:

source_name (str)

name: str = 'sibling_eldest'

class xftsim.arch.SiblingYoungestComponent(source_name)[source]

Bases: _SiblingComponent

Sibling youngest: value of the last (highest IID) member in each group.

Parameters:

source_name (str)

name: str = 'sibling_youngest'
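The six sibling reductions can be compared side by side on a toy family structure. The sketch assumes that each group statistic is broadcast back to every member of the group and that rows arrive sorted by IID, so the first row in a group is the eldest. The helper sibling_reduce is hypothetical.

```python
from collections import defaultdict

def sibling_reduce(values, group_ids, how):
    """Compute one statistic per group and broadcast it to each member."""
    groups = defaultdict(list)
    for v, g in zip(values, group_ids):
        groups[g].append(v)  # rows are assumed to be ordered by IID
    reducers = {
        "mean": lambda xs: sum(xs) / len(xs),
        "sum": sum,
        "any": lambda xs: 1.0 if any(x > 0 for x in xs) else 0.0,
        "count": lambda xs: float(len(xs)),
        "eldest": lambda xs: xs[0],     # first (lowest IID) member
        "youngest": lambda xs: xs[-1],  # last (highest IID) member
    }
    per_group = {g: reducers[how](xs) for g, xs in groups.items()}
    return [per_group[g] for g in group_ids]

y = [1.0, 3.0, 0.0, 4.0, -2.0]
fid = ["a", "a", "a", "b", "b"]

sib_sum = sibling_reduce(y, fid, "sum")
sib_eldest = sibling_reduce(y, fid, "eldest")
```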

BUILTINS Registry

xftsim.arch.BUILTINS: dict[str, type[ArchComponent]]

Registry mapping DSL function names to ArchComponent subclasses.

Used by the formula parser to resolve function calls like genetic(eff) to the corresponding component class.
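A name-to-class registry of this kind can be sketched in a few lines. The classes GeneticSketch and NoiseSketch, the REGISTRY dictionary, and the resolve function are hypothetical stand-ins. The real parser's handling of arguments is not documented here and is likely more involved.

```python
class GeneticSketch:
    """Stand-in component that stores an effects object."""
    def __init__(self, arg):
        self.arg = arg

class NoiseSketch:
    """Stand-in component that stores a variance."""
    def __init__(self, arg):
        self.arg = arg

# Hypothetical registry in the spirit of BUILTINS: DSL name -> class.
REGISTRY = {"genetic": GeneticSketch, "noise": NoiseSketch}

def resolve(call, effects):
    """Turn a call such as 'genetic(eff)' into a component instance."""
    name, arg = call.strip().rstrip(")").split("(", 1)
    cls = REGISTRY[name.strip()]
    arg = arg.strip()
    # Named arguments refer to the effects mapping; anything else is numeric.
    value = effects[arg] if arg in effects else float(arg)
    return cls(value)

g = resolve("genetic(eff)", {"eff": [0.1, -0.2]})
e = resolve("noise(0.5)", {})
```

Keeping the mapping in one dictionary means a new DSL function only needs a new entry, with no change to the lookup code.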