Time to trash ‘junk DNA’

While reading Sean Carroll’s ‘The Making of the Fittest’ for an evolution course I’m taking this semester, I was struck by the concept of ‘junk DNA’, introduced along with the imagery of genes as archipelagos of information amid a sea of junk within the genome. Although this contemporary science book has been fascinating thus far, it was published in 2006 and sections of it are quickly becoming outdated in the fast-paced field of genetics. More and more scrutiny is being applied to the nature of the dogmatically termed junk DNA, as exemplified by a recent study by Steven Xijin Ge, entitled “Exploratory bioinformatics investigation reveals importance of ‘junk’ DNA in early embryo development”[1]. As the title says, the article explores the role of junk DNA on two different planes of focus: during early development, and as a result of evolution over deep time.

In the coming century, our most powerful weapon for attacking scientific uncertainties will be our computing power when analysing “big data”[1]. While scientific investigations have conventionally been hypothesis-driven, the author of this paper has employed an emerging line of scientific investigation, undertaking a bioinformatics-based exploratory data analysis (EDA). This is a philosophically distinct kind of statistical investigation: no hypotheses are proposed and then assigned a probability of being correct; instead, trends and characteristics in the data are revealed serendipitously. A highlighted benefit of this approach is its open-mindedness, which minimises the preconceived biases that could affect results or the path of investigation.

The author sought to analyse a fundamental biological process about which we have limited and incomplete information: early development. Complex genomic data relating to murine morphogenesis was studied using single-cell RNA-seq on 259 nascent cells, reflecting the spatial and temporal gene expression patterns during embryonic development.

At the point of the maternal-to-zygotic transition (MZT), a range of epigenetic factors are removed and the zygotic genome takes control, creating a highly complicated, multiplexed shift in the regulation of expression. The genetic elements responsible include transposons, long terminal repeats (LTRs) and short interspersed nuclear elements (SINEs). Corresponding transposons have been seen to concentrate at the promoter regions of genes with similar expression patterns, helping to establish early expression landscapes within the zygote. LTRs and SINEs are linked to the induction and upregulation of genes, respectively.

A fair amount of contention surrounds the classification and extent of the ‘junk DNA’ contained within an organism’s genome, for humans especially. The ENCODE Consortium, a big science project that has catalogued an Encyclopaedia of DNA Elements, stunned the scientific community when it attributed functionality to ~80% of the entire genome[2]. This directly challenged the conventional notion that the majority of the genome is nonfunctional junk.

This paper offers a new angle on an argument that is frequently proposed in favour of the existence of vast regions of junk DNA, the ‘C-value paradox’ (the enormous variation in genome size between species, which does not correlate with organism complexity)[3]. In nature, the reproductive cycles of organisms can range from 20 minutes to 20 years, creating large differences in adaptive dynamics and, therefore, in the rate of evolution. Slow-reproducing organisms must find alternative strategies to promote the genetic diversity on which natural selection acts. This paper shows there is a linear correlation between generation time (linearised) and genome size, which is a proxy for the volume of reputed junk DNA.
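To make the idea of such a correlation concrete, here is a minimal sketch of how it could be computed. It assumes that "linearised" means a log10 transform of generation time, which is my guess rather than something stated in the paper. The function names and the numbers in the usage lines are hypothetical placeholders, not data or code from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def generation_time_correlation(generation_times, genome_sizes):
    """Correlate log10(generation time) with genome size.

    Assumption: the log10 transform stands in for the 'linearised'
    generation time; the source does not say which transform was used.
    """
    log_times = [math.log10(t) for t in generation_times]
    return pearson_r(log_times, genome_sizes)


# Placeholder values only, NOT measurements from the paper.
example_times_hours = [1.0, 10.0, 100.0, 1000.0]
example_genome_sizes = [2.0, 4.0, 6.0, 8.0]  # arbitrary units
r = generation_time_correlation(example_times_hours, example_genome_sizes)
```

A value of `r` close to 1 would indicate that genome size rises roughly linearly with the transformed generation time; a real analysis would also need to consider how the organisms were chosen.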

Figure (C value): Correlation between genome size and generation time across a subset of model organisms[1]

One source of this genetic diversity, and of the bulk of the genome, is transposable elements (transposons), genetic elements capable of moving themselves from one point in the genome to another. They provide a form of insertional mutagenesis, and their repeated copies can also promote rearrangements through homologous recombination. Although largely dismissed as junk DNA since their discovery in the 1940s by Barbara McClintock[4], the evolutionary advantage that they confer is slowly becoming clearer, with specific transposons seemingly becoming more prominent through evolutionary processes. Studies like this are highlighting the naivety of our terminology, with functionally unattributed regions shaping up to be the drivers of genome evolution in the future, a conclusion profoundly different from their consideration as junk.



1. Ge, S.X., Exploratory bioinformatics investigation reveals importance of “junk” DNA in early embryo development. BMC Genomics, 2017. 18(200).

2. Ecker, J.R., et al., Genomics: ENCODE explained. Nature, 2012. 489(7414).

3. Doolittle, W.F., Is junk DNA bunk? A critique of ENCODE. PNAS, 2013. 110(14).

4. Pray, L., Transposons, or Jumping Genes: Not Junk DNA? Nature Education, 2008. 1(32).
