damuta.constants module

Constants and mutation type definitions for DAMUTA.

This module defines the standard mutation type nomenclature used throughout DAMUTA: COSMIC-style mutation type labels, the COSMIC signatures flagged as artifacts, and the groupings of mutation types (96, 32, 16, and 6 categories) used by the different modeling approaches.

COSMIC_artifact = ['SBS27', 'SBS43', 'SBS45', 'SBS46', 'SBS47', 'SBS48', 'SBS49', 'SBS50', 'SBS51', 'SBS52', 'SBS53', 'SBS54', 'SBS55', 'SBS56', 'SBS57', 'SBS58', 'SBS59', 'SBS60']

COSMIC Single Base Substitution signatures identified as sequencing/processing artifacts.

Type:

list of str
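
As a minimal sketch of how this list might be used, the snippet below separates artifact signatures from a set of signature activities. The list is rebuilt inline so the example runs without DAMUTA installed; `fitted` and its values are made-up illustration data, not DAMUTA output.

```python
# Same 18 names as COSMIC_artifact above, built compactly for a standalone example.
COSMIC_artifact = ['SBS27', 'SBS43'] + [f'SBS{i}' for i in range(45, 61)]

# Hypothetical signature activities for one sample (illustrative values only).
fitted = {'SBS1': 0.40, 'SBS5': 0.35, 'SBS27': 0.15, 'SBS45': 0.10}

artifacts = set(COSMIC_artifact)
kept = {name: value for name, value in fitted.items() if name not in artifacts}
flagged = sorted(name for name in fitted if name in artifacts)

print(kept)     # {'SBS1': 0.4, 'SBS5': 0.35}
print(flagged)  # ['SBS27', 'SBS45']
```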

idx96 = MultiIndex([('C>A', 'ACA'), ('C>A', 'ACC'), ('C>A', 'ACG'), ('C>A', 'ACT'), ('C>A', 'CCA'), ('C>A', 'CCC'), ('C>A', 'CCG'), ('C>A', 'CCT'), ('C>A', 'GCA'), ('C>A', 'GCC'), ('C>A', 'GCG'), ('C>A', 'GCT'), ('C>A', 'TCA'), ('C>A', 'TCC'), ('C>A', 'TCG'), ('C>A', 'TCT'), ('C>G', 'ACA'), ('C>G', 'ACC'), ('C>G', 'ACG'), ('C>G', 'ACT'), ('C>G', 'CCA'), ('C>G', 'CCC'), ('C>G', 'CCG'), ('C>G', 'CCT'), ('C>G', 'GCA'), ('C>G', 'GCC'), ('C>G', 'GCG'), ('C>G', 'GCT'), ('C>G', 'TCA'), ('C>G', 'TCC'), ('C>G', 'TCG'), ('C>G', 'TCT'), ('C>T', 'ACA'), ('C>T', 'ACC'), ('C>T', 'ACG'), ('C>T', 'ACT'), ('C>T', 'CCA'), ('C>T', 'CCC'), ('C>T', 'CCG'), ('C>T', 'CCT'), ('C>T', 'GCA'), ('C>T', 'GCC'), ('C>T', 'GCG'), ('C>T', 'GCT'), ('C>T', 'TCA'), ('C>T', 'TCC'), ('C>T', 'TCG'), ('C>T', 'TCT'), ('T>A', 'ATA'), ('T>A', 'ATC'), ('T>A', 'ATG'), ('T>A', 'ATT'), ('T>A', 'CTA'), ('T>A', 'CTC'), ('T>A', 'CTG'), ('T>A', 'CTT'), ('T>A', 'GTA'), ('T>A', 'GTC'), ('T>A', 'GTG'), ('T>A', 'GTT'), ('T>A', 'TTA'), ('T>A', 'TTC'), ('T>A', 'TTG'), ('T>A', 'TTT'), ('T>C', 'ATA'), ('T>C', 'ATC'), ('T>C', 'ATG'), ('T>C', 'ATT'), ('T>C', 'CTA'), ('T>C', 'CTC'), ('T>C', 'CTG'), ('T>C', 'CTT'), ('T>C', 'GTA'), ('T>C', 'GTC'), ('T>C', 'GTG'), ('T>C', 'GTT'), ('T>C', 'TTA'), ('T>C', 'TTC'), ('T>C', 'TTG'), ('T>C', 'TTT'), ('T>G', 'ATA'), ('T>G', 'ATC'), ('T>G', 'ATG'), ('T>G', 'ATT'), ('T>G', 'CTA'), ('T>G', 'CTC'), ('T>G', 'CTG'), ('T>G', 'CTT'), ('T>G', 'GTA'), ('T>G', 'GTC'), ('T>G', 'GTG'), ('T>G', 'GTT'), ('T>G', 'TTA'), ('T>G', 'TTC'), ('T>G', 'TTG'), ('T>G', 'TTT')], names=['Type', 'Subtype'])

Hierarchical index for 96 mutation types grouped by substitution type and trinucleotide context.

Type:

pd.MultiIndex
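
To illustrate how the `(Type, Subtype)` pairs relate to bracketed COSMIC-style labels, here is a small sketch that converts between the two. The helper names `split_label` and `join_label` are hypothetical and are not part of DAMUTA.

```python
def split_label(label):
    """Split a COSMIC-style label such as 'A[C>A]A' into (Type, Subtype)."""
    five_prime, rest = label.split('[')
    substitution, three_prime = rest.split(']')
    ref = substitution[0]  # reference base, e.g. 'C' in 'C>A'
    return substitution, five_prime + ref + three_prime


def join_label(substitution, context):
    """Inverse of split_label: ('C>A', 'ACA') -> 'A[C>A]A'."""
    return f'{context[0]}[{substitution}]{context[2]}'


pairs = [split_label(s) for s in ['A[C>A]A', 'T[T>G]T']]
print(pairs)  # [('C>A', 'ACA'), ('T>G', 'TTT')]
```

Tuples in this shape can be passed to `pd.MultiIndex.from_tuples(pairs, names=['Type', 'Subtype'])` to obtain an index with the same level names as `idx96`.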

mut16 = ['A_A', 'A_C', 'A_G', 'A_T', 'C_A', 'C_C', 'C_G', 'C_T', 'G_A', 'G_C', 'G_G', 'G_T', 'T_A', 'T_C', 'T_G', 'T_T']

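16 dinucleotide contexts in the format X_W: the pair of bases flanking the mutated position, which the underscore stands in for.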

Type:

list of str
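
The ordering of `mut16` can be reproduced with a Cartesian product over the four bases. This is only a sketch of the pattern in the list above; `flank_label` is a hypothetical helper, not a DAMUTA function.

```python
from itertools import product

bases = 'ACGT'
# The 5' base varies slowest and the 3' base fastest, matching the list above.
mut16 = [f'{five}_{three}' for five, three in product(bases, repeat=2)]


def flank_label(trinucleotide):
    """Drop the centre base of a trinucleotide: 'ACA' -> 'A_A'."""
    return f'{trinucleotide[0]}_{trinucleotide[2]}'


print(len(mut16))          # 16
print(mut16[:5])           # ['A_A', 'A_C', 'A_G', 'A_T', 'C_A']
print(flank_label('GTC'))  # G_C
```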

mut32 = ['ACA', 'ACC', 'ACG', 'ACT', 'CCA', 'CCC', 'CCG', 'CCT', 'GCA', 'GCC', 'GCG', 'GCT', 'TCA', 'TCC', 'TCG', 'TCT', 'ATA', 'ATC', 'ATG', 'ATT', 'CTA', 'CTC', 'CTG', 'CTT', 'GTA', 'GTC', 'GTG', 'GTT', 'TTA', 'TTC', 'TTG', 'TTT']

32 trinucleotide contexts with a pyrimidine (C or T) centre base. These are the mutation types of damage signatures.

Type:

list of str
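
A minimal sketch of how the 32 contexts are ordered, together with a lookup table that could be used to index a length-32 vector. `context_index` is a hypothetical name introduced for illustration.

```python
from itertools import product

bases = 'ACGT'
# The centre base varies slowest (C, then T), then the 5' base, then the 3' base.
mut32 = [five + centre + three
         for centre, five, three in product('CT', bases, bases)]

# Position of each context in the list.
context_index = {ctx: i for i, ctx in enumerate(mut32)}

print(mut32[:4], mut32[16])  # ['ACA', 'ACC', 'ACG', 'ACT'] ATA
print(context_index['TTT'])  # 31
```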

mut6 = ['C>A', 'C>G', 'C>T', 'T>A', 'T>C', 'T>G']

6 substitution types, each written from the pyrimidine (C or T) reference base. These are the mutation types of misrepair signatures.

Type:

list of str
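
There are only six types because, under the usual COSMIC convention, a substitution observed at a purine is reported as its complement on the pyrimidine strand. The sketch below shows that folding for the bare substitution only (it ignores flanking context); `to_pyrimidine` is a hypothetical helper, and whether DAMUTA performs this step itself is not stated here.

```python
COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
mut6 = ['C>A', 'C>G', 'C>T', 'T>A', 'T>C', 'T>G']


def to_pyrimidine(substitution):
    """Express a substitution from its pyrimidine reference base: 'G>T' -> 'C>A'."""
    ref, alt = substitution.split('>')
    if ref in 'CT':
        return substitution
    return f'{COMPLEMENT[ref]}>{COMPLEMENT[alt]}'


# All 12 possible single-base substitutions fold onto the 6 types.
all12 = [f'{r}>{a}' for r in 'ACGT' for a in 'ACGT' if r != a]
folded = sorted({to_pyrimidine(s) for s in all12})
print(folded == mut6)  # True
```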

mut96 = ['A[C>A]A', 'A[C>A]C', 'A[C>A]G', 'A[C>A]T', 'C[C>A]A', 'C[C>A]C', 'C[C>A]G', 'C[C>A]T', 'G[C>A]A', 'G[C>A]C', 'G[C>A]G', 'G[C>A]T', 'T[C>A]A', 'T[C>A]C', 'T[C>A]G', 'T[C>A]T', 'A[C>G]A', 'A[C>G]C', 'A[C>G]G', 'A[C>G]T', 'C[C>G]A', 'C[C>G]C', 'C[C>G]G', 'C[C>G]T', 'G[C>G]A', 'G[C>G]C', 'G[C>G]G', 'G[C>G]T', 'T[C>G]A', 'T[C>G]C', 'T[C>G]G', 'T[C>G]T', 'A[C>T]A', 'A[C>T]C', 'A[C>T]G', 'A[C>T]T', 'C[C>T]A', 'C[C>T]C', 'C[C>T]G', 'C[C>T]T', 'G[C>T]A', 'G[C>T]C', 'G[C>T]G', 'G[C>T]T', 'T[C>T]A', 'T[C>T]C', 'T[C>T]G', 'T[C>T]T', 'A[T>A]A', 'A[T>A]C', 'A[T>A]G', 'A[T>A]T', 'C[T>A]A', 'C[T>A]C', 'C[T>A]G', 'C[T>A]T', 'G[T>A]A', 'G[T>A]C', 'G[T>A]G', 'G[T>A]T', 'T[T>A]A', 'T[T>A]C', 'T[T>A]G', 'T[T>A]T', 'A[T>C]A', 'A[T>C]C', 'A[T>C]G', 'A[T>C]T', 'C[T>C]A', 'C[T>C]C', 'C[T>C]G', 'C[T>C]T', 'G[T>C]A', 'G[T>C]C', 'G[T>C]G', 'G[T>C]T', 'T[T>C]A', 'T[T>C]C', 'T[T>C]G', 'T[T>C]T', 'A[T>G]A', 'A[T>G]C', 'A[T>G]G', 'A[T>G]T', 'C[T>G]A', 'C[T>G]C', 'C[T>G]G', 'C[T>G]T', 'G[T>G]A', 'G[T>G]C', 'G[T>G]G', 'G[T>G]T', 'T[T>G]A', 'T[T>G]C', 'T[T>G]G', 'T[T>G]T']

96 COSMIC-style mutation types in the format X[Y>Z]W, where Y>Z is the substitution and X and W are the 5' and 3' flanking bases.

Type:

list of str
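
The sketch below rebuilds the 96 labels in the order shown and collapses a set of 96-type counts onto the 32 trinucleotide contexts and the 6 substitution types. It only illustrates how the groupings nest; it is not DAMUTA's model, and `counts96` holds made-up data.

```python
from itertools import product

bases = 'ACGT'
mut6 = ['C>A', 'C>G', 'C>T', 'T>A', 'T>C', 'T>G']
# The substitution varies slowest, then the 5' base, then the 3' base.
mut96 = [f'{five}[{sub}]{three}'
         for sub, five, three in product(mut6, bases, bases)]

# Hypothetical counts: one mutation of every type.
counts96 = dict.fromkeys(mut96, 1)

counts32, counts6 = {}, {}
for label, n in counts96.items():
    sub = label[2:5]                         # e.g. 'C>A'
    context = label[0] + sub[0] + label[6]   # e.g. 'ACA'
    counts32[context] = counts32.get(context, 0) + n
    counts6[sub] = counts6.get(sub, 0) + n

print(len(counts32), counts32['ACA'])  # 32 3
print(len(counts6), counts6['C>A'])    # 6 16
```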