API reference

Processing VCF

class dismal.callset.CallSet(vcf_path: str = None, npz_path: str = None)

Class to represent a scikit-allel callset (i.e. a VCF).

__init__(vcf_path: str = None, npz_path: str = None) None

Represent a VCF as a scikit-allel callset.

Parameters
  • vcf_path (str, optional) – Path to VCF, defaults to None

  • npz_path (str, optional) – Output path, defaults to None

Making blocks

dismal.blocking.blocklen_using_dxy(callset: CallSet, sample_map: dict, numerator: int = 3) float

Rule-of-thumb blocklength estimation using the 3/dxy rule.

Parameters
  • callset (CallSet) – Call data.

  • sample_map (dict) – Sample:population map.

  • numerator (int, optional) – Desired mean of the between-population segregating sites distribution, defaults to 3

Returns

Recommended blocklength.

Return type

float
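The arithmetic behind the 3/dxy rule can be sketched as follows: the block length is chosen so that the expected number of between-population differences per block (block length × dxy) equals the numerator. The helper name blocklen_from_dxy is hypothetical, and it takes a precomputed per-site dxy instead of a CallSet and sample map; it illustrates the rule only and is not the dismal implementation.

```python
def blocklen_from_dxy(dxy: float, numerator: float = 3) -> float:
    """Hypothetical helper: block length expected to hold `numerator` differences."""
    if dxy <= 0:
        raise ValueError("dxy must be positive")
    # Expected differences per block = blocklen * dxy, so solve for blocklen.
    return numerator / dxy


# With a per-site divergence of 1%, blocks of roughly 300 bp are suggested.
recommended = blocklen_from_dxy(0.01)
```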

dismal.blocking.make_blocks(block_size: int, annotation: pyranges.pyranges_main.PyRanges | str = None, callable_sites: pyranges.pyranges_main.PyRanges | str = None, features: list[str] = None, trim_start: int = 10, trim_end: int = 10) PyRanges

Make blocks with respect to an annotation and/or callable sites.

Parameters
  • block_size (int) – Desired length of genomic blocks.

  • annotation (PyRanges | str, optional) – Genome annotation, defaults to None

  • callable_sites (PyRanges | str, optional) – Callable sites, defaults to None

  • features (list[str], optional) – Features to subset for (will be removed in a future version), defaults to None

  • trim_start (int, optional) – Length of sequence to trim from the start, defaults to 10

  • trim_end (int, optional) – Length of sequence to trim from the end, defaults to 10

Raises

ValueError – If neither callable_sites nor annotation is provided.

Returns

Blocks

Return type

PyRanges
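As a rough illustration of how block_size, trim_start and trim_end could interact, the sketch below trims a single interval and then cuts it into non-overlapping, full-length blocks. The helper tile_interval is hypothetical, works on plain half-open (start, end) coordinates instead of PyRanges, and assumes that trimming applies to each interval before tiling; the actual behaviour of make_blocks may differ.

```python
def tile_interval(start: int, end: int, block_size: int,
                  trim_start: int = 10, trim_end: int = 10) -> list[tuple[int, int]]:
    """Hypothetical helper: trim an interval, then cut it into full-length blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    lo, hi = start + trim_start, end - trim_end
    blocks = []
    # Keep only blocks that fit entirely inside the trimmed interval.
    while lo + block_size <= hi:
        blocks.append((lo, lo + block_size))
        lo += block_size
    return blocks


# A 250 bp interval, trimmed by 10 bp at each end, holds two 100 bp blocks.
blocks = tile_interval(1000, 1250, block_size=100)
```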

dismal.blocking.make_random_blocks(callset: CallSet, block_size: int, chrom_sizes: dict = None, blocks_per_pair: int = 1000) PyRanges

Generate random blocks without considering annotation or coverage (e.g. if data is pre-filtered, or simulated).

Parameters
  • callset (CallSet) – Call data.

  • block_size (int) – Size of genomic blocks.

  • chrom_sizes (dict, optional) – Chromosome sizes, defaults to None in which case first and last positions are used.

  • blocks_per_pair (int, optional) – Number of blocks to make per pair of individuals, defaults to 1000

Returns

Block coordinates

Return type

PyRanges
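The idea of drawing blocks at random within chromosome bounds can be sketched with the standard library alone. The helper random_blocks and its n_blocks and seed parameters are hypothetical; it takes chromosome sizes directly and returns plain tuples instead of PyRanges, so it shows the sampling idea and not what make_random_blocks does internally.

```python
import random


def random_blocks(chrom_sizes: dict, block_size: int,
                  n_blocks: int, seed: int = 0) -> list:
    """Hypothetical helper: sample (chrom, start, end) blocks within chromosome bounds."""
    rng = random.Random(seed)
    # Only chromosomes long enough to hold a whole block are eligible.
    eligible = {c: n for c, n in chrom_sizes.items() if n >= block_size}
    if not eligible:
        raise ValueError("no chromosome is long enough for block_size")
    chroms = list(eligible)
    # Weight chromosomes by their number of valid block start positions.
    weights = [eligible[c] - block_size + 1 for c in chroms]
    blocks = []
    for _ in range(n_blocks):
        chrom = rng.choices(chroms, weights=weights)[0]
        start = rng.randint(0, eligible[chrom] - block_size)  # inclusive bounds
        blocks.append((chrom, start, start + block_size))
    return blocks


sizes = {"chr1": 10_000, "chr2": 5_000, "tiny": 50}
sampled = random_blocks(sizes, block_size=100, n_blocks=20)
```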

dismal.blocking.segregating_sites_distribution(blocks: pyranges.pyranges_main.PyRanges | str, sample_map: dict, save_blocks_bed: str = 'blocks_with_state.bed', save_distr_npz: str = 's_distr.npz') tuple[numpy.array]

Compute segregating sites distributions within and between populations.

Parameters
  • blocks (PyRanges | str) – Blocks

  • sample_map (dict) – Map sample:population

  • save_blocks_bed (str, optional) – Path to save blocks to, defaults to “blocks_with_state.bed”

  • save_distr_npz (str, optional) – Path to save distributions to, defaults to “s_distr.npz”

Returns

Distributions

Return type

tuple[np.array]
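One plausible reading of a segregating sites distribution is a count vector in which entry k is the number of blocks containing exactly k segregating sites. The sketch below builds such a vector from per-block counts; the helper counts_to_distribution is hypothetical and the vector layout is an assumption, since the format of the returned arrays is not specified above.

```python
from collections import Counter


def counts_to_distribution(sites_per_block: list) -> list:
    """Hypothetical helper: entry k is the number of blocks with k segregating sites."""
    if not sites_per_block:
        return []
    tally = Counter(sites_per_block)
    # Fill every index from 0 up to the largest observed count.
    return [tally.get(k, 0) for k in range(max(sites_per_block) + 1)]


# Six blocks: three with 0 sites, one with 1 site, two with 2 sites.
between = counts_to_distribution([0, 2, 1, 2, 0, 0])
```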

Using pre-defined models

dismal.models.gim(sampled_deme_names=None, asymmetric_migration=True)

Create three-epoch generalised isolation-with-migration (GIM) model (allow migration post-split) using default names

dismal.models.iim(sampled_deme_names=None, asymmetric_migration=True)

Create three-epoch isolation-with-initial-migration model (migration only in middle epoch) with default names

dismal.models.im(sampled_deme_names=None, asymmetric_migration=True)

Create two-epoch isolation-with-migration model with default names.

dismal.models.iso_three_epoch(sampled_deme_names=None)

Create three-epoch isolation model (no migration) with default names

dismal.models.iso_two_epoch(sampled_deme_names=None)

Create two-epoch isolation model with default names

dismal.models.secondary_contact(sampled_deme_names=None, asymmetric_migration=True)

Create three-epoch secondary contact model (migration only in most recent epoch) with default names

Automatically fitting multiple models with MultiModel

class dismal.multimodel.MultiModel(s1: list[int], s2: list[int], s3: list[int], sampled_deme_names: tuple[str], max_epochs: int = 3, threads: int = 1)

__init__(s1: list[int], s2: list[int], s3: list[int], sampled_deme_names: tuple[str], max_epochs: int = 3, threads: int = 1) None

Class to fit and represent multiple models fitted on the same data.

Parameters
  • s1 (list[int]) – Distribution of segregating sites within population 1.

  • s2 (list[int]) – Distribution of segregating sites within population 2.

  • s3 (list[int]) – Distribution of segregating sites between populations.

  • sampled_deme_names (tuple[str]) – Names of sampled (current) populations.

  • max_epochs (int, optional) – Maximum number of epochs (including ancestral population) to consider, defaults to 3

  • threads (int, optional) – Number of threads to use (0 = all, -1 = all but one, etc.), defaults to 1
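The threads convention (0 = all, -1 = all but one, and so on) can be illustrated with a small resolver. The helper resolve_threads and its n_cpus parameter are hypothetical, and the floor of one thread is an assumption; how MultiModel resolves the value internally is not documented here.

```python
import os
from typing import Optional


def resolve_threads(threads: int, n_cpus: Optional[int] = None) -> int:
    """Hypothetical helper: turn a `threads` setting into a concrete worker count."""
    if n_cpus is None:
        n_cpus = os.cpu_count() or 1
    if threads > 0:
        return threads
    # 0 -> all CPUs, -1 -> all but one, and so on; never fewer than one.
    return max(1, n_cpus + threads)


workers = resolve_threads(-1, n_cpus=8)
```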

static likelihood_ratio_test(null_mod: DemographicModel, alt_mod: DemographicModel, alpha: float = 0.05, verbose: bool = True) tuple[float]

Likelihood ratio test between two fitted models. Does not adjust for composite likelihood non-independence.

Parameters
  • null_mod (DemographicModel) – Null model - must be nested within alternate model.

  • alt_mod (DemographicModel) – Model to test against null.

  • alpha (float, optional) – Alpha parameter for significance, defaults to 0.05

  • verbose (bool, optional) – Verbosity, defaults to True

Returns

(likelihood ratio, p-value)

Return type

tuple[float]
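For orientation, the textbook form of a likelihood ratio test between nested models is sketched below: the statistic is twice the difference in log-likelihoods, compared against a chi-square distribution. The function names, the explicit df argument (taken here as the difference in the number of free parameters) and the use of raw log-likelihoods as inputs are assumptions for illustration; the method above takes fitted DemographicModel objects and may compute its statistic differently.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2
    if df % 2 == 0:
        # Closed form for even df: exp(-y) * sum_{i < df/2} y**i / i!
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: start from df = 1 and apply the incomplete-gamma recurrence.
    tail = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-y) * y ** (i - 0.5) / math.gamma(i + 0.5)
    return tail


def lrt(lnl_null: float, lnl_alt: float, df: int) -> tuple:
    """Return (test statistic, p-value) for a null model nested in the alternative."""
    stat = 2 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)


stat, p_value = lrt(lnl_null=-100.0, lnl_alt=-97.0, df=2)
reject_null = p_value < 0.05
```

As noted above, blocks are not fully independent under a composite likelihood, so a p-value obtained this way should be read with caution.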

Specifying custom models

class dismal.demographicmodel.DemographicModel(model_ref: str = None)

__init__(model_ref: str = None) None

Represent single demographic model.

Parameters

model_ref (str, optional) – Model name, defaults to None

add_epoch(n_demes: int, migration: bool, deme_ids: list[tuple[str]] = None, asymmetric_migration: bool = True, migration_direction: list[tuple[str]] = None) None

Add an epoch

Parameters
  • n_demes (int) – Number of demes in epoch

  • migration (bool) – Whether to allow migration

  • deme_ids (list[tuple[str]], optional) – Deme names, defaults to None

  • asymmetric_migration (bool, optional) – Whether to allow asymmetric migration, defaults to True

  • migration_direction (list[tuple[str]], optional) – Direction of migration e.g. (“A”, “B”) for backwards-in-time migration A->B, defaults to None
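A custom model is built by calling add_epoch once per epoch. The sketch below adds three epochs resembling an isolation-with-initial-migration model, using only the two required arguments from the signature above. It assumes that epochs are added from the most recent to the most ancient, which is not stated in this reference. The RecordingModel class is a stand-in so that the example runs without dismal installed; in practice the function would be passed a DemographicModel instance.

```python
def add_iim_like_epochs(model) -> None:
    """Add three epochs to `model`, any object exposing the add_epoch method above."""
    # Assumption: epochs are added from most recent to most ancient.
    model.add_epoch(n_demes=2, migration=False)  # recent: two isolated demes
    model.add_epoch(n_demes=2, migration=True)   # middle: two demes with migration
    model.add_epoch(n_demes=1, migration=False)  # ancestral: a single deme


class RecordingModel:
    """Stand-in (not part of dismal) that records the add_epoch calls it receives."""

    def __init__(self):
        self.epochs = []

    def add_epoch(self, n_demes, migration, **kwargs):
        self.epochs.append((n_demes, migration))


demo = RecordingModel()
add_iim_like_epochs(demo)
```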

bootstrap_mle(mutation_rate: float, blocklen: int, recombination_rate: float = 0, n_bootstraps: int = 100) array

Compute bootstrap replicates of the maximum-likelihood estimates.

Parameters
  • mutation_rate (float) – Mutation rate

  • blocklen (int) – Block length

  • recombination_rate (float, optional) – Recombination rate (note that original model was fitted assuming no recombination), defaults to 0

  • n_bootstraps (int, optional) – Number of bootstrap replicates, defaults to 100

Returns

Bootstrap values

Return type

np.array
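Bootstrap values are commonly summarised as a percentile confidence interval. The sketch below does this for the replicate estimates of a single parameter, given as a plain list of floats; the helper percentile_interval is hypothetical, and how the array returned by bootstrap_mle is laid out (for example, one row per replicate) is not specified here, so extracting a single parameter from it is left to the reader.

```python
import math


def percentile_interval(values: list, level: float = 0.95) -> tuple:
    """Hypothetical helper: percentile confidence interval from bootstrap replicates."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= level <= 1:
        raise ValueError("level must be between 0 and 1")
    ordered = sorted(values)
    tail = (1 - level) / 2

    def quantile(q: float) -> float:
        # Linear interpolation between the two nearest order statistics.
        pos = q * (len(ordered) - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    return quantile(tail), quantile(1 - tail)


replicates = [float(i) for i in range(101)]
lower, upper = percentile_interval(replicates, level=0.9)
```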

demes_format(mutation_rate, blocklen, log_time=True)

Represent model in Demes format - to be removed in next refactor

demesdraw(mutation_rate=None, blocklen=None, log_time=True)

Draw Demes model; convenience function for demes_format().drawing - to be removed in next refactor

fit_model(s1: array, s2: array, s3: array, initial_values: list = None, bounds: list[tuple] = None, optimisers: list = None) None

Fit model to estimate parameters

Parameters
  • s1 (np.array) – Segregating sites distribution within population 1

  • s2 (np.array) – Segregating sites distribution within population 2

  • s3 (np.array) – Segregating sites distribution between populations

  • initial_values (list, optional) – Initial values for optimisation, defaults to None in which case defaults are used

  • bounds (list[tuple], optional) – Bounds for parameters (low, high), defaults to None

  • optimisers (list, optional) – Optimisation algorithms to use, defaults to None in which case L-BFGS-B and Nelder-Mead are tried sequentially

Raises

RuntimeError – If all optimisers fail
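The behaviour described above, in which optimisers are tried one after another and a RuntimeError is raised only if all of them fail, follows a common fallback pattern. The sketch below shows that pattern with plain callables standing in for optimisers; the helper fit_with_fallback and the toy functions are hypothetical and do not reproduce the internals of fit_model.

```python
from typing import Callable, Sequence


def fit_with_fallback(optimisers: Sequence[Callable[[], float]]) -> float:
    """Hypothetical helper: return the result of the first optimiser that succeeds."""
    errors = []
    for optimise in optimisers:
        try:
            return optimise()
        except Exception as err:
            # Record the failure and move on to the next optimiser.
            name = getattr(optimise, "__name__", "optimiser")
            errors.append(f"{name}: {err}")
    raise RuntimeError("all optimisers failed: " + "; ".join(errors))


def failing_optimiser() -> float:
    raise ValueError("did not converge")


def working_optimiser() -> float:
    return 1.5


best = fit_with_fallback([failing_optimiser, working_optimiser])
```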

kldiv_fitted_observed()

Evaluate model fit (KL-divergence) against observed data

kldiv_fitted_true(true_modinst: ModelInstance, s_max: int = 500)

Evaluate model fit (KL-divergence) against a specified parameter set (ModelInstance)
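For reference, the Kullback-Leibler divergence between two discrete distributions P and Q on the same support is the sum over k of P(k) log(P(k) / Q(k)). The sketch below implements that definition for plain probability lists; the function name is illustrative, and which distribution dismal treats as P and which as Q in the two methods above is not stated here.

```python
import math


def kl_divergence(p: list, q: list) -> float:
    """KL(P || Q) in nats for two discrete distributions on the same support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pk, qk in zip(p, q):
        if pk == 0:
            continue  # a term with P(k) = 0 contributes nothing
        if qk == 0:
            return math.inf  # P puts mass where Q has none
        total += pk * math.log(pk / qk)
    return total


divergence = kl_divergence([0.5, 0.5], [0.25, 0.75])
```

A value of zero means the two distributions are identical; larger values indicate a poorer match.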