DNA consists of a sequence of four bases: A, C, G, and T. Key types of epigenetic modification include histone protein modifications and DNA methylation.

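The DNA strand is wrapped around bundles of proteins to make chromosomes more compact. Roughly every 200 base pairs, the strand wraps around a nucleosome, with about 147 base pairs wound around the protein core and the rest forming linker DNA between nucleosomes: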

Nucleosome: a bundle of eight histone proteins whose amino-acid tails carry epigenetic marks, i.e. chemical modifications at specific positions along the tail. Ex) H3K4me3 refers to three methyl groups (me3) attached to the lysine (K) at position 4 of the H3 histone protein's tail.
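The naming convention for histone marks can be made concrete with a small parser. The following is a minimal sketch, not a standard library: the names `HistoneMark`, `MARK_PATTERN`, and `parse_histone_mark` are hypothetical, and the lookup tables cover only a handful of common residues and modifications.

```python
import re
from typing import NamedTuple

# One-letter amino-acid codes for residues often modified on histone tails
# (an illustrative subset, not an exhaustive list).
AMINO_ACIDS = {"K": "lysine", "R": "arginine", "S": "serine", "T": "threonine"}

# Abbreviations for a few common modifications (also an illustrative subset).
MODIFICATIONS = {
    "me": "methylation",
    "ac": "acetylation",
    "ph": "phosphorylation",
    "ub": "ubiquitination",
}


class HistoneMark(NamedTuple):
    histone: str       # e.g. "H3"
    residue: str       # e.g. "lysine"
    position: int      # position of the residue along the tail
    modification: str  # e.g. "methylation"
    count: int         # number of groups attached (1 if not stated)


# Pattern: histone, residue letter, position, modification, optional count.
MARK_PATTERN = re.compile(r"^(H2A|H2B|H3|H4)([KRST])(\d+)(me|ac|ph|ub)([123]?)$")


def parse_histone_mark(name: str) -> HistoneMark:
    """Split a mark name such as 'H3K4me3' into its components."""
    match = MARK_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {name!r}")
    histone, residue, position, mod, count = match.groups()
    return HistoneMark(
        histone=histone,
        residue=AMINO_ACIDS[residue],
        position=int(position),
        modification=MODIFICATIONS[mod],
        count=int(count) if count else 1,
    )
```

For example, `parse_histone_mark("H3K4me3")` yields histone H3, lysine, position 4, methylation, count 3, matching the reading given above.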

Nucleosomes work in combination with epigenetic marks, e.g. methylation of stretches of DNA, to block transcription of a gene; this blocking is called repression. Transcription factors (proteins) can more easily access the DNA between nucleosomes. Gene regulation is a competition between nucleosomes occupying the DNA and transcription factors pushing the nucleosomes apart.

Combinations of these marks form epigenetic signatures, which allow regions of the genome to be characterized by function:

  1. Enhancers: increase the likelihood of transcription of a particular gene (can be very far from the transcription start site and are dynamic across cell types)
  2. Promoters: places where RNA polymerase binds to DNA to begin transcription (close to the transcription start site and stable across cell types)
  3. Transcribed regions: DNA that is actively copied into RNA
  4. Repressed regions: DNA where transcription is blocked

A key task is automatically annotating and discovering these states within the genome.
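
To illustrate what annotating these states can look like, here is a minimal rule-based sketch. It is not a method described in this article. It assumes the genome has been divided into fixed-size bins (200 base pairs is an arbitrary choice) and that the set of histone marks detected in each bin is already known. The rules use commonly reported associations (H3K4me3 with promoters, H3K4me1 plus H3K27ac with enhancers, H3K36me3 with transcribed regions, H3K27me3 with repression), which are a simplification. The function names `classify_bin`, `annotate`, and `merge_segments` are hypothetical.

```python
from itertools import groupby
from typing import List, Set, Tuple


def classify_bin(marks: Set[str]) -> str:
    """Assign one state to a genomic bin from the marks observed in it.

    The rules are checked in order, so a bin with several marks
    receives the first matching state.
    """
    if "H3K4me3" in marks:
        return "promoter"
    if "H3K4me1" in marks and "H3K27ac" in marks:
        return "enhancer"
    if "H3K36me3" in marks:
        return "transcribed"
    if "H3K27me3" in marks:
        return "repressed"
    return "unannotated"


def annotate(bins: List[Set[str]]) -> List[str]:
    """Classify every bin independently."""
    return [classify_bin(marks) for marks in bins]


def merge_segments(states: List[str], bin_size: int = 200) -> List[Tuple[int, int, str]]:
    """Merge runs of identical states into (start, end, state) segments.

    Coordinates are in base pairs, with `end` exclusive.
    """
    segments = []
    start = 0
    for state, run in groupby(states):
        length = sum(1 for _ in run) * bin_size
        segments.append((start, start + length, state))
        start += length
    return segments


if __name__ == "__main__":
    # Made-up input for illustration only, not real measurements.
    example_bins = [
        {"H3K4me3", "H3K27ac"},
        {"H3K4me3"},
        {"H3K36me3"},
        {"H3K36me3"},
        {"H3K27me3"},
        set(),
    ]
    for segment in merge_segments(annotate(example_bins)):
        print(segment)
```

Hand-written rules like these only annotate states that are already known. Discovering states generally requires learning them from data, for example with a probabilistic model that also takes neighboring bins into account.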