founders
Below is an auto-generated summary of the xftsim.founders submodule API.
- xftsim.founders.founder_haplotypes_from_AFs(n, afs, diploid=True, generation=0)[source]
Generate founder haplotypes from specified allele frequencies.
- Parameters:
n (int) – Number of individuals to simulate.
afs (Iterable) – Allele frequencies as an iterable of floats (one per variant).
diploid (bool, optional) – Flag indicating if the generated haplotypes should be diploid or haploid.
generation (int, optional) – Generation number for the founders. Default is 0.
- Return type:
xft.struct.DenseHaplotypeArray
- Returns:
An object representing a set of haplotypes generated from the given allele frequencies.
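To illustrate the idea of generating haplotypes from allele frequencies, here is a minimal standard-library sketch in which each haplotype entry is an independent Bernoulli draw with the variant's allele frequency. It is not xftsim's implementation: the helper name sketch_haplotypes_from_afs, the seed argument, and the column layout (the haplotype copies of a variant placed in adjacent columns) are assumptions made for this example.

```python
import random


def sketch_haplotypes_from_afs(n, afs, diploid=True, seed=None):
    """Return n rows of 0/1 alleles; each entry is a Bernoulli(af) draw.

    Hypothetical helper for illustration only, not part of xftsim.
    """
    rng = random.Random(seed)
    afs = list(afs)
    ploidy = 2 if diploid else 1
    # One row per individual; each variant contributes `ploidy` adjacent
    # columns (a layout chosen for this sketch).
    return [
        [int(rng.random() < af) for af in afs for _ in range(ploidy)]
        for _ in range(n)
    ]


haps = sketch_haplotypes_from_afs(n=4, afs=[0.1, 0.5, 0.9], seed=1)
fixed = sketch_haplotypes_from_afs(n=2, afs=[0.0, 1.0], seed=1)
```

With xftsim installed, the call documented above would take the same leading arguments, for example xftsim.founders.founder_haplotypes_from_AFs(n=4, afs=[0.1, 0.5, 0.9]).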
- xftsim.founders.founder_haplotypes_uniform_AFs(n, m, minMAF=0.1, generation=0)[source]
Generate founder haplotypes from uniform-distributed allele frequencies.
- Return type:
xft.struct.DenseHaplotypeArray
- Returns:
An object representing a set of haplotypes generated with uniform allele frequencies.
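The parameters of this function are not documented above, so the following is only a plausible reading of the signature: m allele frequencies drawn uniformly from [minMAF, 1 - minMAF], so that every variant has minor allele frequency of at least minMAF. The sampling interval, the helper name sketch_uniform_afs, and the seed argument are assumptions for this sketch, not confirmed library behaviour.

```python
import random


def sketch_uniform_afs(m, minMAF=0.1, seed=None):
    """Draw m allele frequencies uniformly from [minMAF, 1 - minMAF].

    Hypothetical helper; the interval is an assumption about what
    "uniform-distributed allele frequencies" means here.
    """
    rng = random.Random(seed)
    return [rng.uniform(minMAF, 1.0 - minMAF) for _ in range(m)]


afs = sketch_uniform_afs(m=5, minMAF=0.1, seed=0)
```

Frequencies drawn this way could then be used wherever an iterable of per-variant allele frequencies is expected.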
- xftsim.founders.founder_haplotypes_from_sgkit_dataset(gdat, generation=0)[source]
Construct a founder haplotype array from an sgkit DataArray. Useful in conjunction with sgkit.io.vcf.vcf_to_zarr() and sgkit.load_dataset().
- xftsim.founders.founder_haplotypes_from_plink_bfile(path, generation=0)[source]
Reads in PLINK 1 binary genotype data and returns a DenseHaplotypeArray object containing pseudo-haplotypes, formed by randomly assigning the alleles at heterozygous sites to the two haplotypes.
- Return type:
xft.struct.DenseHaplotypeArray
- Returns:
Founder pseudo-haplotype array. The “pseudo-” prefix refers to the fact that the PLINK bfile format doesn’t track phase.
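The random assignment at heterozygous sites can be pictured with a small standard-library sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with no missing values; the helper name sketch_pseudo_haplotypes and the seed argument are hypothetical, and this is not the library's code.

```python
import random


def sketch_pseudo_haplotypes(genotypes, seed=None):
    """Split alt-allele counts (0, 1, 2) into two 0/1 haplotypes.

    Hypothetical helper for illustration; missing genotypes are not handled.
    """
    rng = random.Random(seed)
    hap_a, hap_b = [], []
    for g in genotypes:
        if g == 1:
            # Heterozygous: phase is unknown, so pick the carrier at random.
            a = rng.randint(0, 1)
            b = 1 - a
        else:
            # Homozygous (0 or 2): both haplotypes carry the same allele.
            a = b = g // 2
        hap_a.append(a)
        hap_b.append(b)
    return hap_a, hap_b


genotypes = [0, 1, 2, 1, 0]
hap_a, hap_b = sketch_pseudo_haplotypes(genotypes, seed=42)
```

Whatever the random draws, the two haplotypes always sum back to the original genotype at every site, which is the only constraint unphased data imposes.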
- xftsim.founders.founder_haplotypes_from_msprime_grg(n, sequence_length, Ne=10000, recombination_rate=1e-08, mutation_rate=1e-08, generation=0, *, binary_muts=False, use_node_times=False, no_simplify=False, maintain_topo=False, ts_coals=False)[source]
Generate founder haplotypes using msprime and return them as a GraphHaplotypeOperator.
This function simulates ancestry and mutations for a population of size n over a sequence of length sequence_length, then converts the resulting TreeSequence into a Genotype Representation Graph (GRG) via the grgl CLI.
- Parameters:
n (int) – Number of diploid individuals to simulate.
sequence_length (int) – The length of the genomic region to simulate (in base pairs).
Ne (float, optional) – Effective population size. Default is 10000.
recombination_rate (float, optional) – Recombination rate per base pair per generation. Default is 1e-8.
mutation_rate (float, optional) – Mutation rate per base pair per generation. Default is 1e-8.
generation (int, optional) – Generation number for the founders. Default is 0.
binary_muts (bool, optional) – Flag to pass --binary-muts to grgl.
use_node_times (bool, optional) – Flag to pass --ts-node-times to grgl.
no_simplify (bool, optional) – Flag to pass --no-simplify to grgl.
maintain_topo (bool, optional) – Flag to pass --maintain-topo to grgl.
ts_coals (bool, optional) – Flag to pass --ts-coals to grgl to calculate diploid coalescence information.
- Return type:
xft.struct.GraphHaplotypeOperator
- Returns:
The operator containing the simulated founder graph and metadata.
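The five keyword-only booleans each correspond to one grgl command-line flag. A minimal sketch of that mapping follows, assuming the flags are spelled with a double hyphen as is conventional for long CLI options. The helper name sketch_grgl_flags is hypothetical, and the rest of the grgl invocation (subcommand, input and output paths) is deliberately left out because it is not described here.

```python
def sketch_grgl_flags(*, binary_muts=False, use_node_times=False,
                      no_simplify=False, maintain_topo=False, ts_coals=False):
    """Translate the boolean keyword arguments into a list of CLI flags.

    Hypothetical helper for illustration; not xftsim's internal code.
    """
    mapping = [
        (binary_muts, "--binary-muts"),
        (use_node_times, "--ts-node-times"),
        (no_simplify, "--no-simplify"),
        (maintain_topo, "--maintain-topo"),
        (ts_coals, "--ts-coals"),
    ]
    # Keep only the flags whose keyword argument is truthy.
    return [flag for enabled, flag in mapping if enabled]


flags = sketch_grgl_flags(binary_muts=True, ts_coals=True)
```

Leaving every argument at its default of False yields an empty list, so no extra flags would be passed.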
- xftsim.founders.founder_haplotypes_from_stdpopsim_grg(samples, model_id='OutOfAfrica_3G09', chromosome='chr22', species_id='HomSap', genetic_map=None, left=None, right=None, mutation_rate=None, engine_name='msprime', generation=0, *, binary_muts=False, use_node_times=False, no_simplify=False, maintain_topo=False, ts_coals=False)[source]
Generate founder haplotypes from a stdpopsim demographic model and return them as a GraphHaplotypeOperator.
Simulates a TreeSequence using a published stdpopsim demographic model (e.g. HomSap/OutOfAfrica_3G09) and converts the result to a Genotype Representation Graph (GRG) via the grgl CLI.
- Parameters:
samples (Dict[str, int]) – Mapping of stdpopsim population name to number of diploid individuals to draw from that population (e.g. {"YRI": 100, "CEU": 100, "CHB": 100}). The available population names depend on the chosen demographic model.
model_id (str, optional) – Identifier of the stdpopsim demographic model. Default is "OutOfAfrica_3G09".
chromosome (str, optional) – Chromosome identifier passed to species.get_contig. Default is "chr22".
species_id (str, optional) – Species identifier used by stdpopsim. Default is "HomSap" (Homo sapiens).
genetic_map (str or None, optional) – Optional stdpopsim genetic map identifier (e.g. "HapMapII_GRCh38"). If None, the contig uses a uniform recombination rate.
left (int or None, optional) – Left coordinate (in base pairs, 0-based inclusive) of a sub-region of the chromosome to simulate. If None, simulation starts at position 0. Use together with right to shorten simulations for faster tests.
right (int or None, optional) – Right coordinate (in base pairs, exclusive) of a sub-region of the chromosome to simulate. If None, simulation runs to the end of the contig.
mutation_rate (float or None, optional) – Override for the contig’s mutation rate. If None, defaults to the demographic model’s calibrated mutation rate when one is published (model.mutation_rate); otherwise stdpopsim’s species/contig default is used.
engine_name (str, optional) – stdpopsim simulation engine to use. Default is "msprime".
generation (int, optional) – Generation number for the founders. Default is 0.
binary_muts (bool, optional) – Flag to pass --binary-muts to grgl.
use_node_times (bool, optional) – Flag to pass --ts-node-times to grgl.
no_simplify (bool, optional) – Flag to pass --no-simplify to grgl.
maintain_topo (bool, optional) – Flag to pass --maintain-topo to grgl.
ts_coals (bool, optional) – Flag to pass --ts-coals to grgl to calculate diploid coalescence information.
- Return type:
xft.struct.GraphHaplotypeOperator
- Returns:
The operator containing the simulated founder graph and metadata.
Notes
Sample IIDs are prefixed with the stdpopsim population name (e.g. "YRI_0", "YRI_1", …). The full per-individual population label is also stored on samples.extra["population"] so it can be used as a grouping variable by xftsim.arch.GroupingComponent.
Variant positions and alleles are read directly from the GRG; pos_cM is computed by integrating the contig’s recombination map. vid is formatted as "{chromosome}:{pos_bp}:{ref}:{alt}".
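The naming and coordinate conventions in these notes can be sketched with plain Python. The three helpers below are hypothetical and make several assumptions: the IID index restarts at 0 within each population, the recombination map is piecewise constant with rates in per base pair per generation, and centimorgans are obtained as 100 times the integrated rate. None of this is taken from the library's source.

```python
def sketch_iids(samples):
    """Build population-prefixed IIDs and matching population labels."""
    iids, populations = [], []
    for pop, count in samples.items():
        for i in range(count):
            # Assumption: numbering restarts at 0 for each population.
            iids.append(f"{pop}_{i}")
            populations.append(pop)
    return iids, populations


def sketch_vid(chromosome, pos_bp, ref, alt):
    """Format a variant ID as "{chromosome}:{pos_bp}:{ref}:{alt}"."""
    return f"{chromosome}:{pos_bp}:{ref}:{alt}"


def sketch_pos_cM(pos_bp, breakpoints, rates):
    """Integrate a piecewise-constant recombination map from 0 to pos_bp.

    breakpoints: sorted interval edges in bp, len(rates) + 1, starting at 0.
    rates: recombination rate per bp per generation within each interval.
    """
    morgans = 0.0
    for left, right, rate in zip(breakpoints[:-1], breakpoints[1:], rates):
        if pos_bp <= left:
            break
        # Add the part of this interval that lies to the left of pos_bp.
        morgans += rate * (min(pos_bp, right) - left)
    return 100.0 * morgans  # Morgans to centimorgans


iids, populations = sketch_iids({"YRI": 2, "CEU": 1})
vid = sketch_vid("chr22", 12345, "A", "G")
cm_uniform = sketch_pos_cM(1_000_000, [0, 2_000_000], [1e-8])
cm_piecewise = sketch_pos_cM(1500, [0, 1000, 2000], [1e-8, 2e-8])
```

Under a uniform rate of 1e-8 per base pair, one megabase corresponds to roughly 1 cM, which gives a quick sanity check on any pos_cM values.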