founders

Below is an auto-generated summary of the xftsim.founders submodule API.

xftsim.founders.founder_haplotypes_from_AFs(n, afs, diploid=True)

Generate founder haplotypes from specified allele frequencies.

Parameters:
  • n (int) – Number of haplotypes to simulate.

  • afs (Iterable) – Allele frequencies as an iterable of floats.

  • diploid (bool, optional) – Flag indicating if the generated haplotypes should be diploid or haploid.

Returns:

xft.struct.HaplotypeArray – An object representing a set of haplotypes generated from the given allele frequencies.
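
The idea behind sampling founders from allele frequencies can be sketched without xftsim at all. The snippet below is a minimal pure-Python illustration, not the library's implementation: the helper name sketch_haplotypes_from_afs, the seed argument, and the layout (one row per sample, two adjacent columns per variant when diploid) are all assumptions made for the example.

```python
import random


def sketch_haplotypes_from_afs(n, afs, diploid=True, seed=None):
    """Illustrative stand-in, NOT the xftsim implementation."""
    rng = random.Random(seed)
    # Assumed layout: two haplotype columns per variant when diploid.
    copies = 2 if diploid else 1
    # Repeat each allele frequency once per haplotype copy.
    column_afs = [p for p in afs for _ in range(copies)]
    # Each entry is an independent Bernoulli(p) draw; 1 marks the counted allele.
    return [[1 if rng.random() < p else 0 for p in column_afs]
            for _ in range(n)]


haps = sketch_haplotypes_from_afs(n=4, afs=[0.0, 0.5, 1.0], seed=1)
print(len(haps), len(haps[0]))  # number of rows, number of columns
```

Frequencies of 0.0 and 1.0 give all-zero and all-one columns respectively, which is a convenient sanity check for any implementation of this kind.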

Reads PLINK 1 binary genotype data and returns a HaplotypeArray object containing pseudo-haplotypes, constructed by randomly assigning alleles to haplotypes at heterozygous sites.

Parameters:

path (str) – The file path to the PLINK 1 binary genotype data.

Returns:

xr.DataArray – Founder pseudo-haplotype array with samples indexed by an xftsim.index.SampleIndex object and variants indexed by an xftsim.index.HaploidVariantIndex object. The “pseudo-” prefix reflects the fact that the PLINK bfile format doesn’t track phase.
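
Because unphased genotypes only record allele counts, phase at heterozygous sites has to be chosen arbitrarily. The following pure-Python sketch shows one way to do that for a single sample; it illustrates the concept rather than the library's code. The helper name sketch_pseudo_haplotypes, the 0/1/2 genotype coding, and the decision to raise on any other value (such as a missing call) are assumptions.

```python
import random


def sketch_pseudo_haplotypes(genotypes, seed=None):
    """Split unphased 0/1/2 allele counts into two pseudo-haplotypes.

    Illustrative only; not the xftsim implementation.
    """
    rng = random.Random(seed)
    hap_a, hap_b = [], []
    for g in genotypes:
        if g == 0:        # homozygous for the uncounted allele
            a, b = 0, 0
        elif g == 2:      # homozygous for the counted allele
            a, b = 1, 1
        elif g == 1:      # heterozygous: phase is unknown, so pick at random
            a = rng.randint(0, 1)
            b = 1 - a
        else:
            raise ValueError(f"unsupported genotype value: {g!r}")
        hap_a.append(a)
        hap_b.append(b)
    return hap_a, hap_b


genotypes = [0, 1, 2, 1, 1, 0]
hap_a, hap_b = sketch_pseudo_haplotypes(genotypes, seed=0)
# The two pseudo-haplotypes always sum back to the original allele counts.
print([a + b for a, b in zip(hap_a, hap_b)] == genotypes)
```

Whatever random phase is chosen, per-site allele counts are preserved; only the linkage between heterozygous sites is arbitrary.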

xftsim.founders.founder_haplotypes_from_sgkit_dataset(gdat)

Construct a founder haplotype array from an sgkit Dataset. Useful in conjunction with sgkit.io.vcf.vcf_to_zarr() and sgkit.load_dataset().

Parameters:
  • gdat (xr.Dataset) – Dataset generated by sgkit.load_dataset()

  • generation (int) – Used to populate the generation attribute of xftsim.index.SampleIndex

Returns:

xr.DataArray – Array of founder haplotypes with samples indexed by an xftsim.index.SampleIndex object and variants indexed by an xftsim.index.HaploidVariantIndex object.
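
sgkit datasets store calls in a call_genotype variable with dimensions (variants, samples, ploidy), whereas a haplotype array has one row per sample. The sketch below shows that reshaping on plain nested lists. It is only an illustration: the helper name sketch_flatten_call_genotype and the column order (the haplotype copies of each variant placed side by side) are assumptions, not a statement of what xftsim does internally.

```python
def sketch_flatten_call_genotype(call_genotype):
    """Rearrange (variants, samples, ploidy) calls into one row per sample.

    Illustrative only; the interleaved column order is an assumption.
    """
    n_variants = len(call_genotype)
    n_samples = len(call_genotype[0]) if n_variants else 0
    return [
        [allele
         for v in range(n_variants)
         for allele in call_genotype[v][s]]
        for s in range(n_samples)
    ]


# Three variants, two diploid samples, laid out as (variants, samples, ploidy).
calls = [
    [[0, 1], [1, 1]],
    [[0, 0], [1, 0]],
    [[1, 1], [0, 1]],
]
rows = sketch_flatten_call_genotype(calls)
print(rows[0])  # sample 0 across all variants and copies
print(rows[1])  # sample 1 across all variants and copies
```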

xftsim.founders.founder_haplotypes_uniform_AFs(n, m, minMAF=0.1)

Generate founder haplotypes from uniformly distributed allele frequencies.

Parameters:
  • n (int) – Number of haplotypes to simulate.

  • m (int) – Number of variants.

  • minMAF (float, optional) – Minimum minor allele frequency for generated haplotypes.

Returns:

xft.struct.HaplotypeArray – An object representing a set of haplotypes generated from uniformly distributed allele frequencies.
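
One plausible reading of the minMAF parameter is that each variant's allele frequency is drawn uniformly from the interval [minMAF, 1 - minMAF], so that the minor allele frequency never falls below minMAF. The sketch below illustrates that reading in pure Python; the interval, the helper name sketch_uniform_afs, and the seed argument are assumptions rather than documented behaviour.

```python
import random


def sketch_uniform_afs(m, min_maf=0.1, seed=None):
    """Draw m allele frequencies uniformly from [min_maf, 1 - min_maf].

    Illustrative only; the sampling interval is an assumption.
    """
    if not 0.0 <= min_maf <= 0.5:
        raise ValueError("min_maf must lie between 0 and 0.5")
    rng = random.Random(seed)
    return [rng.uniform(min_maf, 1.0 - min_maf) for _ in range(m)]


afs = sketch_uniform_afs(m=5, min_maf=0.1, seed=42)
# Minor allele frequency is the smaller of p and 1 - p.
mafs = [min(p, 1.0 - p) for p in afs]
print(all(maf >= 0.1 - 1e-12 for maf in mafs))
```

Frequencies drawn this way could then be passed to a sampler that takes explicit allele frequencies, such as founder_haplotypes_from_AFs.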