Kahani


Unraveling the Story of Proteins: The Start


When we think of proteins, many of us think of protein-rich foods such as meats, fish, and legumes. These foods matter because they supply amino acids, the building blocks of proteins.

Proteins are essential to what makes us human, and today, we’ll be unraveling their story. 

It Starts with the Genome

DNA, our genetic code, is the starting point for the creation of proteins. However, DNA cannot leave the nucleus to be translated by ribosomes (macromolecular machines in the cell’s cytoplasm that synthesize proteins). Thus, the information in DNA must first be transcribed into messenger RNA (mRNA).

Image Source: Boston University

As shown, a specific gene (which codes for a protein) is transcribed into an mRNA strand. mRNA is single-stranded, while DNA is double-stranded.
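To make transcription concrete, here is a minimal Python sketch. The function names and the short example sequence are made up for illustration. It shows two equivalent ways to get the mRNA sequence: swapping T for U on the coding strand, or pairing each base of the template strand with its RNA partner. Real transcription involves RNA polymerase, promoters, and much more, none of which is modeled here.

```python
# Hypothetical helper names; the sequences are toy examples, not real genes.

# RNA base that pairs with each DNA base on the template strand.
DNA_TO_RNA_COMPLEMENT = str.maketrans("ATGC", "UACG")


def transcribe_coding_strand(coding_5to3: str) -> str:
    """mRNA matches the coding strand, with uracil (U) in place of thymine (T)."""
    return coding_5to3.upper().replace("T", "U")


def transcribe_template_strand(template_3to5: str) -> str:
    """Build mRNA by pairing each template base (read 3' to 5') with its RNA partner."""
    return template_3to5.upper().translate(DNA_TO_RNA_COMPLEMENT)


coding = "ATGGCCTGG"    # coding strand, written 5' to 3'
template = "TACCGGACC"  # its partner strand, written 3' to 5'

print(transcribe_coding_strand(coding))
print(transcribe_template_strand(template))
```

Both calls print the same mRNA sequence, which is the point: the mRNA is a copy of the coding strand, built by reading the template strand.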

However, not all DNA codes for proteins. In fact, only about 1 to 2% of the human genome is protein-coding; the rest is noncoding and includes regulatory sequences, introns, and other regions. In addition, there are more proteins than genes: the human genome has about 20,000 protein-coding genes, yet estimates of the number of distinct proteins run to about 100,000 or more. Much of this gap is explained by introns, exons, and alternative splicing.

Introns are stretches within a gene that are removed from the RNA transcript, so they do not end up in the final mRNA. Exons are the stretches that are kept and joined together, and they carry the protein-coding sequence.

Image Source: Wikimedia Commons

When mRNA is produced, introns are usually spliced out, as shown above. With alternative splicing, however, different sets of exons can be joined together to create a variety of protein sequences from a single gene.

Image: Oxford Nanopore Technologies

As shown above, there are different ways to splice an RNA transcript before it becomes mature mRNA. An exon can be skipped for one version of a protein and included in another. Additionally, introns can be retained in the final mRNA sequence (intron retention); this is an active area of research, and intron retention has been shown to play a role in gene regulation, with possible impacts on health and disease.
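Exon skipping is easy to picture in code. The sketch below is a toy model only: the exon names, the sequences, and the `splice` and `all_isoforms` helpers are invented for illustration, and it ignores intron retention and the actual splicing machinery. It joins the chosen exons in order and lists every combination that results when some exons are optional.

```python
from itertools import combinations

# Toy exons in gene order as (name, sequence) pairs; not taken from a real gene.
PRE_MRNA_EXONS = [("E1", "AUGGCC"), ("E2", "UGGAAA"), ("E3", "GGGUAA")]


def splice(exons, keep):
    """Join the sequences of the exons named in `keep`, preserving gene order."""
    return "".join(seq for name, seq in exons if name in keep)


def all_isoforms(exons, optional):
    """Return every mRNA obtainable by including or skipping the optional exons."""
    names = [name for name, _ in exons]
    required = set(names) - set(optional)
    isoforms = {}
    for count in range(len(optional) + 1):
        for chosen in combinations(optional, count):
            keep = required | set(chosen)
            label = "+".join(name for name in names if name in keep)
            isoforms[label] = splice(exons, keep)
    return isoforms


# E2 is treated as an exon that may be skipped; E1 and E3 are always kept.
for label, mrna in all_isoforms(PRE_MRNA_EXONS, optional=["E2"]).items():
    print(label, mrna)
```

With one optional exon there are two possible mRNAs, and each additional optional exon doubles the count, which gives a feel for how roughly 20,000 genes can yield far more than 20,000 proteins.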

The Process

A couple of modifications are made to mRNA to extend its life in the cytoplasm: a methylated cap is added to the start of the strand (the 5’ end), and a long run of ‘A’ nucleotides (known as the poly-A tail) is added to the other end (the 3’ end). The cap assists with mRNA binding to ribosomes, and the poly-A tail protects the coding sequence: in the cytoplasm, nucleases (enzymes that degrade nucleic acids) chew through the poly-A tail before they reach the actual sequence.
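The protective role of the poly-A tail can be sketched as a toy model. Everything here is an assumption made for illustration: the `trim_3prime` helper, the tail length of 20, and the idea that degradation simply removes a fixed number of nucleotides from the 3’ end. Real mRNA decay is more involved.

```python
def trim_3prime(mrna: str, n: int) -> str:
    """Toy model of degradation: remove up to n nucleotides from the 3' end."""
    return mrna[: max(len(mrna) - n, 0)]


coding = "AUGGCCUGGUAA"     # toy coding sequence, not a real gene
tailed = coding + "A" * 20  # the same sequence with a 20-nucleotide poly-A tail

# Lose 15 nucleotides from the 3' end in each case.
with_tail = trim_3prime(tailed, 15)
without_tail = trim_3prime(coding, 15)

print("coding intact with tail:   ", with_tail.startswith(coding))
print("coding intact without tail:", without_tail.startswith(coding))
```

The tail acts as a buffer: the same amount of trimming leaves the tailed message readable but destroys the untailed one.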

Once bound to a ribosome, the mRNA is translated into a protein. Instead of explaining that process, here’s an animation that visually shows how translation works. 

What’s interesting is how the protein folds, as a protein’s structure determines its function. 
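For readers who prefer code to animation, here is a minimal sketch of the reading logic behind translation. The `translate` function and the example mRNA are made up for illustration; the codon table is the standard genetic code, written compactly with one-letter amino acid codes and `*` for stop codons. It starts at the first AUG, reads three bases at a time, and stops at the first stop codon. The ribosome, tRNAs, and everything else that makes translation work in a cell are left out.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G (third base varies fastest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: amino_acid
    for (a, b, c), amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError on non-RNA letters
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy mRNA: two untranslated bases, then AUG GCC UGG UAA, then two more bases.
print(translate("GGAUGGCCUGGUAAGG"))
```

The output is a string of one-letter amino acid codes, which is the primary structure of the protein.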

Image Source: Khan Academy

A peptide chain (a sequence of amino acids) forms the primary structure. Hydrogen bonding between atoms of the peptide backbone causes the chain to fold into repeating patterns, forming alpha helices or beta pleated sheets (secondary structure). Next, a 3D shape (tertiary structure) emerges from interactions between the side chains (R-groups), including hydrogen bonds, ionic bonds, hydrophobic interactions, and disulfide bridges. Most proteins have tertiary structure, although some (known as intrinsically disordered proteins) lack a stable fold. A protein has quaternary structure only when multiple polypeptide subunits come together.

Because protein structure determines function, elucidating the structure of proteins is a key part of biomedical research. For example, the digestive enzyme amylase has an active site shaped to bind starch; meanwhile, each subunit of hemoglobin holds a heme group whose iron atom binds oxygen, allowing hemoglobin to carry oxygen to the rest of the body.

Yet there are many proteins for which the structure is unknown (part of the “dark proteome”). To uncover the structure of proteins, scientists have traditionally used experimental techniques such as X-ray crystallography, which is expensive and time-consuming. Now, techniques using artificial intelligence and deep learning are being employed to predict the structure of unknown proteins. One of these innovations is AlphaFold 3 (developed by Google DeepMind and Isomorphic Labs), which has been highly effective at predicting the structure of proteins it hasn’t seen before. Read more about it here.

What Can Go Wrong?

The process to produce a protein from start to finish is incredibly complex, and as expected, things can go wrong. There can be mutations in the genetic code (some of which affect the structure of the final protein, and therefore its function), or a protein can misfold. However, we’ll touch upon that in Part 2 – Unraveling the Story of Proteins: The Result. 

Thanks for reading! 


Hi! I’m Sareena, and welcome to Kahani. Read more about me here.