utils
Below is an auto-generated summary of the xftsim.utils submodule API.
- class xftsim.utils.ConstantCount(count)
Bases:
VariableCount
Class representing a constant count of individuals in a population.
- draw
a function that generates an array of counts
- Type:
Callable
- expectation
expected count
- Type:
float
- nonzero_fraction
the fraction of the population that is nonzero
- Type:
float
- Parameters:
count (
int
) – The constant count of individuals in the population.
- class xftsim.utils.MixtureCount(componentCounts, mixture_probabilities)
Bases:
VariableCount
Class representing a mixture of VariableCounts of individuals in a population.
- draw
a function that generates an array of counts
- Type:
Callable
- expectation
expected count
- Type:
float
- nonzero_fraction
the fraction of the population that is nonzero
- Type:
float
- Parameters:
componentCounts (
Iterable
) – An iterable of VariableCount instances, representing the components of the mixture.
mixture_probabilities (
NDArray[Shape["*"], Float64]
) – An array of probabilities associated with each component in the mixture.
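For intuition, the expectation and nonzero fraction of a mixture are the probability-weighted averages of the component values. The sketch below shows only that arithmetic in plain Python. `mixture_summary` is a hypothetical helper, not part of xftsim, and it makes no claim about how MixtureCount computes its attributes internally.

```python
import math

def mixture_summary(expectations, nonzero_fractions, probabilities):
    """Probability-weighted expectation and nonzero fraction of a mixture.

    Hypothetical helper for illustration; not an xftsim function.
    """
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("mixture probabilities must sum to 1")
    expectation = sum(p * e for p, e in zip(probabilities, expectations))
    nonzero = sum(p * z for p, z in zip(probabilities, nonzero_fractions))
    return expectation, nonzero

# Two components: a constant count of 2 (always nonzero) and a constant 0.
print(mixture_summary([2.0, 0.0], [1.0, 0.0], [0.75, 0.25]))  # (1.5, 0.75)
```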
- class xftsim.utils.NegativeBinomialCount(r, p)
Bases:
VariableCount
Class representing a negative binomial-distributed count of individuals in a population.
- draw
a function that generates an array of counts
- Type:
Callable
- expectation
expected count
- Type:
float
- nonzero_fraction
the fraction of the population that is nonzero
- Type:
float
- Parameters:
r (
float
) – The number of successes in the negative binomial distribution.
p (
float
) – The probability of success in the negative binomial distribution.
- class xftsim.utils.PoissonCount(rate)
Bases:
VariableCount
Class representing a Poisson-distributed count of individuals in a population.
- draw
a function that generates an array of counts
- Type:
Callable
- expectation
expected count
- Type:
float
- nonzero_fraction
the fraction of the population that is nonzero
- Type:
float
- Parameters:
rate (
float
) – The Poisson rate parameter.
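As a worked example of the two summary attributes, a Poisson count with a given rate has mean equal to the rate and is nonzero with probability 1 - exp(-rate). The sketch below evaluates these textbook formulas. `poisson_summary` is a hypothetical name, and whether PoissonCount stores exactly these values is an assumption.

```python
import math

def poisson_summary(rate):
    """Mean and P(count > 0) for a Poisson(rate) count (illustrative only)."""
    if rate < 0:
        raise ValueError("rate must be non-negative")
    return rate, 1.0 - math.exp(-rate)

mean, nonzero = poisson_summary(2.0)
print(mean, round(nonzero, 4))  # 2.0 0.8647
```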
- class xftsim.utils.VariableCount(draw, expectation=None, nonzero_fraction=None)
Bases:
object
A class to represent random count variables
…
- draw
a function that generates an array of counts
- Type:
Callable
- expectation
expected count
- Type:
float
- nonzero_fraction
the fraction of the population that is nonzero
- Type:
float
- property expectation
Getter function for expectation attribute.
- Returns:
float
– Expected count.
- property nonzero_fraction
Getter function for nonzero_fraction attribute.
- Returns:
float
– The fraction of the population that is nonzero.
- class xftsim.utils.ZeroTruncatedPoissonCount(rate)
Bases:
VariableCount
Class representing a zero-truncated Poisson-distributed count of individuals in a population.
- draw
a function that generates an array of counts
- Type:
Callable
- expectation
expected count
- Type:
float
- nonzero_fraction
the fraction of the population that is nonzero
- Type:
float
- Parameters:
rate (
float
) – The Poisson rate parameter prior to zero-truncation.
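Zero-truncation conditions the Poisson count on being at least 1, so the mean becomes rate / (1 - exp(-rate)) and every draw is nonzero. The sketch below checks that formula against a simple rejection sampler written with the standard library. Both function names are hypothetical, and the sampling strategy is an assumption rather than a description of how ZeroTruncatedPoissonCount draws its values.

```python
import math
import random

def zero_truncated_poisson_mean(rate):
    """Mean of a Poisson(rate) count conditioned on being at least 1."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    # -expm1(-rate) equals 1 - exp(-rate) but stays accurate for small rates.
    return rate / -math.expm1(-rate)

def draw_zero_truncated_poisson(rate, size, seed=None):
    """Rejection sampler: draw Poisson counts and discard zeros."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    rng = random.Random(seed)
    threshold = math.exp(-rate)
    draws = []
    while len(draws) < size:
        # Knuth's multiplication method for a single Poisson draw.
        count, running_product = 0, rng.random()
        while running_product > threshold:
            count += 1
            running_product *= rng.random()
        if count > 0:
            draws.append(count)
    return draws

draws = draw_zero_truncated_poisson(2.0, size=20_000, seed=0)
print(min(draws), sum(draws) / len(draws), zero_truncated_poisson_mean(2.0))
```

The sample mean should land close to the analytic value of roughly 2.313 for a rate of 2, and the minimum draw is always at least 1.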
- xftsim.utils.cartesian_product(*args)
Returns a list of columns comprising the Cartesian product of the input arrays. Emulates the R function expand.grid().
- Parameters:
*args (
NDArray[Any, Any]
) – The input arrays.
- Returns:
List[NDArray[Any, Any]]
– The list of columns.
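To illustrate the column layout of a Cartesian product, the sketch below builds one column per input with the first input varying fastest, which is the ordering R's expand.grid() uses. `cartesian_columns` is a hypothetical stand-in built on itertools. That xftsim.utils.cartesian_product orders its rows the same way is an assumption.

```python
from itertools import product

def cartesian_columns(*args):
    """Return one column (list) per input, covering every combination.

    The first input varies fastest, as in R's expand.grid().
    Hypothetical helper for illustration; not an xftsim function.
    """
    if not args:
        return []
    # itertools.product varies its last argument fastest, so reverse the
    # inputs, then flip each row back into the original argument order.
    rows = [row[::-1] for row in product(*reversed(args))]
    if not rows:
        return [[] for _ in args]
    return [list(column) for column in zip(*rows)]

columns = cartesian_columns([1, 2], ["a", "b", "c"])
print(columns[0])  # [1, 2, 1, 2, 1, 2]
print(columns[1])  # ['a', 'a', 'b', 'b', 'c', 'c']
```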
- xftsim.utils.cov2cor(A)
Converts covariance matrix to correlation matrix.
Parameters:
- A: Union[np.ndarray, pd.DataFrame, xr.DataArray]
Input covariance matrix.
Returns:
- Union[np.ndarray, pd.DataFrame, xr.DataArray]
Correlation matrix.
Raises:
None
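The conversion divides each covariance by the product of the two corresponding standard deviations, that is, cor[i][j] = cov[i][j] / sqrt(cov[i][i] * cov[j][j]). A minimal plain-Python sketch on nested lists follows. `cov_to_cor` is a hypothetical name, and the real cov2cor operates on NumPy, pandas, or xarray objects as documented above.

```python
import math

def cov_to_cor(cov):
    """Convert a square covariance matrix (nested lists) to correlations."""
    size = len(cov)
    # Standard deviations are the square roots of the diagonal entries.
    sd = [math.sqrt(cov[i][i]) for i in range(size)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(size)]
            for i in range(size)]

cor = cov_to_cor([[4.0, 2.0], [2.0, 9.0]])
print(cor[0][0], round(cor[0][1], 4))  # 1.0 0.3333
```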
- xftsim.utils.ensure2D(x)
Ensures the input array is 2D, by adding a new dimension if needed.
- Parameters:
x (
arraylike
) – The input array.
- Returns:
NDArray[Any, Any]
– The 2D input array.
- Raises:
ValueError – If the input array is not valid.
- xftsim.utils.exhaustive_enumerate(a, n_per_a)
Repeats the ith element of array a n_per_a[i] times, ordered so that every element appears min(j, n_per_a[i]) times before any element appears j+1 times.
Parameters:
- a: array-like
1-D array of any length and data type.
- n_per_a: array-like
1-D array of int giving the number of times each element of a is repeated.
Returns:
- out: array-like
1-D array of shape (n,) with the same data type as a, where n is the sum of n_per_a and the elements are interleaved as described above.
Raises:
Warning : If the output array is empty.
Examples:
>>> exhaustive_enumerate(np.array((1, 2, 3, 4)), np.array((3, 2, 1, 0)))
array([1, 2, 3, 1, 2, 1])
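The ordering rule amounts to a round-robin pass over the elements that still have repeats left. The plain-Python sketch below reproduces the documented example. `round_robin_repeat` is a hypothetical name, and the loop is an illustration of the rule, not the library's implementation.

```python
def round_robin_repeat(a, n_per_a):
    """Repeat a[i] n_per_a[i] times, cycling through elements in order."""
    remaining = list(n_per_a)
    out = []
    while any(count > 0 for count in remaining):
        # One pass: emit every element that still has repeats left.
        for i, value in enumerate(a):
            if remaining[i] > 0:
                out.append(value)
                remaining[i] -= 1
    return out

print(round_robin_repeat([1, 2, 3, 4], [3, 2, 1, 0]))  # [1, 2, 3, 1, 2, 1]
```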
- xftsim.utils.exhaustive_permutation(a, n_sample)
Returns a random permutation of the input array such that each element is selected exactly once before any element is selected twice, and so forth.
Parameters:
- a: NDArray[Shape["*"], Any]
A numpy array to be permuted.
- n_sample: int
An integer specifying the size of the permutation to be returned.
Returns:
- np.ndarray
A 1D numpy array containing the permuted elements.
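One way to satisfy the "every element once before any element twice" property is to concatenate independently shuffled copies of the input and truncate to the requested length. The sketch below does that with the standard library. `exhaustive_sample` is a hypothetical name, and the block-wise shuffling is an assumption about one possible approach, not a description of exhaustive_permutation's internals.

```python
import random

def exhaustive_sample(a, n_sample, seed=None):
    """Sample n_sample items so that counts never differ by more than one."""
    items = list(a)
    if not items and n_sample > 0:
        raise ValueError("cannot sample from an empty input")
    rng = random.Random(seed)
    out = []
    while len(out) < n_sample:
        # Each block is a fresh permutation of the whole input.
        block = items[:]
        rng.shuffle(block)
        out.extend(block[: n_sample - len(out)])
    return out

sample = exhaustive_sample(["a", "b", "c"], 7, seed=1)
print(sample)
print({item: sample.count(item) for item in "abc"})
```

With three inputs and seven draws, every element appears either two or three times, and the first three draws are always a permutation of the input.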
- xftsim.utils.ids_from_generation(generation, indices=None)
Generates and returns a new array of IDs using the given generation number and the given indices. The new array contains the given indices with the generation number prefixed to each index.
- Parameters:
generation (
int
) – The generation number to use in the prefix of the IDs.
indices (
NDArray[Shape["*"], Int64]
, optional) – A numpy array of indices.
- Returns:
ndarray
– A new numpy array of IDs with the given generation number prefixed to each index.
- xftsim.utils.ids_from_generation_range(generation, n=None)
Returns an array of string IDs of length n, created by concatenating the input generation with an increasing sequence of integers from 0 to n-1.
Parameters:
- generation: int
An integer representing the generation of the IDs to be created.
- n: NDArray[Shape["*"], Int64], optional (default=None)
An integer specifying the number of IDs to be generated. If None, a range of IDs starting from 0 is created.
Returns:
- np.ndarray
A 1D numpy array containing the IDs in string format.
- xftsim.utils.ids_from_n_generation(n, generation)
Creates an array of individual IDs based on the specified number of elements and generation.
- Parameters:
n (
int
) – The number of individuals.
generation (
int
) – The generation number.
- Returns:
numpy.ndarray
– An array of individual IDs.
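The three ID helpers above all prefix an index with a generation number. The sketch below shows the general idea for a run of n consecutive indices. `make_ids` is a hypothetical name, and the underscore separator is an assumption, since the text above does not state how the generation and index are joined.

```python
def make_ids(generation, n, sep="_"):
    """Build n string IDs of the form '<generation><sep><index>'.

    The separator is an assumption made for illustration.
    """
    return [f"{generation}{sep}{index}" for index in range(n)]

print(make_ids(3, 4))  # ['3_0', '3_1', '3_2', '3_3']
```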
- xftsim.utils.match(a, b)
For each element of a, finds the index of the matching element in b.
Parameters:
- a: List[Hashable]
List of elements to find matches for.
- b: List[Hashable]
List of elements to find matches in.
Returns:
- List[int]
A list of indices in b that match the elements in a.
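A lookup of this kind can be sketched with a dictionary that maps each element of b to its position. `match_indices` is a hypothetical name. Using the first occurrence for repeated values and raising KeyError for elements missing from b are assumptions of this sketch, since the text above does not specify either case.

```python
def match_indices(a, b):
    """Return, for each element of a, the index of its first match in b."""
    first_index = {}
    for index, item in enumerate(b):
        # setdefault keeps the first position when b contains repeats.
        first_index.setdefault(item, index)
    return [first_index[item] for item in a]

print(match_indices(["c", "a", "c"], ["a", "b", "c"]))  # [2, 0, 2]
```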
- xftsim.utils.matching_indices_conditional(a, b, condition)
Returns the indices of matches between a and b arrays, given a boolean condition.
- xftsim.utils.merge_duplicate_pairs(a, b, n, sort=False)
Merge duplicate pairs of values in a and b based on their corresponding values in n.
Parameters:
- a: NDArray[Shape["*"], Any]
First array to merge.
- b: NDArray[Shape["*"], Any]
Second array to merge.
- n: NDArray[Shape["*"], Any]
Array of corresponding values that determine how the duplicates are merged.
- sort: bool, optional
Whether to sort the values in a and b before merging the duplicates. Default is False.
Returns:
- Tuple[NDArray[Shape["*"], Any], NDArray[Shape["*"], Any], NDArray[Shape["*"], Any]]
The merged arrays, with duplicates removed based on the corresponding values in n.
- xftsim.utils.merge_duplicates(it)
Merge duplicates in the input array by checking if any pasted elements are the same.
- Parameters:
it (
Iterable
) – A numpy array with elements to be checked for duplication.
- Returns:
list
– Returns the input list with duplicates merged if present.
- xftsim.utils.paste(it, sep='_')
Concatenates elements in a list-like object with a specified separator.
- Parameters:
it (
list-like
) – The list-like object containing elements to concatenate.
sep (
str
, optional) – The separator used to concatenate the elements. Defaults to "_".
- Returns:
numpy.ndarray
– An array of concatenated string elements.
- xftsim.utils.print_tree(x, depth=0)
Prints a dict of dicts (of dicts, and so on) as an easy-to-read tree, similar to the bash program 'tree'. Modified from https://stackoverflow.com/questions/47131263/python-3-6-print-dictionary-data-in-readable-tree-structure
- Parameters:
x (
Any
) – Dict of dicts
- xftsim.utils.profiled(call, level=1, message=None, sep=' | ')
A decorator that prints the duration of a function call when the specified logging level is met.
- Parameters:
call (
function
) – The function being decorated.
level (
int
, optional) – The logging level at which the duration of the function call is printed. Defaults to 1.
message (
str
, optional) – A custom message to display in the log output. If not provided, the name of the decorated function will be used.
- Returns:
TYPE
– Description
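For readers unfamiliar with timing decorators, the general pattern can be sketched as follows. Everything here is hypothetical: `timed` is not the xftsim decorator, `VERBOSITY` is an invented module-level setting standing in for whatever logging level xftsim consults, and the output format is a guess based only on the `message` and `sep` parameters listed above.

```python
import functools
import time

VERBOSITY = 1  # hypothetical global logging level, not an xftsim setting

def timed(call=None, *, level=1, message=None, sep=" | "):
    """Print how long the wrapped function took when VERBOSITY >= level."""
    def decorate(func):
        label = message if message is not None else func.__name__

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Report even if the call raised, then let the error propagate.
                if VERBOSITY >= level:
                    elapsed = time.perf_counter() - start
                    print(f"{label}{sep}{elapsed:.4f}s")
        return wrapper

    # Support both bare @timed and parameterized @timed(level=2).
    return decorate(call) if call is not None else decorate

@timed(message="sum of squares")
def sum_of_squares(n):
    return sum(i * i for i in range(n))

print(sum_of_squares(10))  # prints the timing line, then 285
```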
- xftsim.utils.sort_and_paste(x)
Sorts the input array in ascending order and concatenates the first element with an underscore separator followed by the second element.
Parameters:
- x: array-like
1-D array of any length and data type.
Returns:
- out: array-like
1-D array of strings with shape (n,) and the same length as x, where each element is formed by concatenating two sorted string representations of each element in x, separated by an underscore.
Examples:
>>> sort_and_paste(np.array((3, 1, 2)))
array(['1_2', '2_3', '1_3'], dtype='<U3')
- xftsim.utils.standardize_array(a)
Standardizes columns of a 2D array.
Parameters:
- a: ArrayLike
Input 2D array.
Returns:
- np.ndarray
Standardized 2D array.
Raises:
None
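Column standardization subtracts each column's mean and divides by its standard deviation, so every column ends up with mean 0 and unit variance. The plain-Python sketch below works on a list of rows. `standardize_columns` is a hypothetical name, and the use of the population standard deviation (dividing by n, not n - 1) is an assumption, since the text above does not say which convention standardize_array follows.

```python
import statistics

def standardize_columns(rows):
    """Z-score each column of a 2D list of rows (population std)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(column) for column in columns]
    # pstdev divides by n; a constant column would give 0 and fail below.
    sds = [statistics.pstdev(column) for column in columns]
    return [[(value - mean) / sd for value, mean, sd in zip(row, means, sds)]
            for row in rows]

scaled = standardize_columns([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
print([round(row[0], 4) for row in scaled])  # [-1.2247, 0.0, 1.2247]
```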
- xftsim.utils.standardize_array_hw(haplotypes, af)
Wraps _standardize_array_hw to prevent segfaults.
Parameters:
- haplotypes: NDArray[Shape["*,*"], Int8]
Input array of int8 haploid genotypes.
- af: NDArray[Shape["*"], Float]
Input array of allele frequencies.
Returns:
- np.ndarray
Standardized genotypes.
Raises:
None
- xftsim.utils.to_proportions(*args)
Converts input values to proportional values.
Parameters:
- *args: Union[float, int]
Input values.
Returns:
- np.ndarray
Proportional values.
Raises:
None
- xftsim.utils.to_simplex(*args)
Converts input values to a simplex vector.
Parameters:
- *args: Union[float, int]
Input values.
Returns:
- np.ndarray
Simplex vector.
Raises:
- ValueError
If all input values are less than or equal to zero.
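A simplex vector has non-negative entries that sum to 1, so the core operation is dividing each value by the total. The sketch below shows that normalization together with the documented ValueError. `normalize_to_simplex` is a hypothetical name, and the sketch assumes non-negative inputs, because the text above does not say how negative values mixed with positive ones are treated.

```python
def normalize_to_simplex(*values):
    """Scale non-negative values so that they sum to 1."""
    if all(value <= 0 for value in values):
        raise ValueError("at least one input value must be positive")
    # Assumes no negative entries; mixed signs are not handled specially here.
    total = sum(values)
    return [value / total for value in values]

print(normalize_to_simplex(1, 1, 2))  # [0.25, 0.25, 0.5]
```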
- xftsim.utils.unique_identifier(frame, index_variables, prefix=None)
Returns a unique identifier string generated from index variables of a dataframe.
Parameters:
- frame: pd.DataFrame
Input dataframe.
- index_variables: List[str]
List of column names to be used as index.
- prefix: str
Optional prefix
Returns:
- str
Unique identifier string of the form [<prefix>..]<index_var1>.<index_var2>…
Raises:
None