The dictionary of life has been updated. A DNA sequence that signals cells in almost all other organisms to stop synthesising proteins instead encodes a rare amino acid in some archaea, according to a study published in Science in November.
Archaea are microbes that resemble bacteria in shape and size but are biologically distinct.
Calling the study “the first of its kind,” Abhrajyoti Ghosh, associate professor of biological sciences at the Bose Institute, Kolkata, said the discovery could help scientists engineer proteins with “functional advantages that have been hitherto unknown.” Dr. Ghosh studies how archaea respond to stress.
The study’s findings provide “yet another fantastic example of how biology hides secrets that drive biotechnology innovation,” Alanna Schepartz, professor of chemistry at the University of California, Berkeley, and a coauthor of the study, said in a statement.
Reading the dictionary
By the late 1960s, scientists had identified the set of rules that dictate how a sequence of DNA corresponds to the order in which amino acids are placed in proteins. These rules came to be called the genetic code.
At the heart of this code are the four nitrogen-containing bases that are a part of DNA: adenine (A), guanine (G), cytosine (C) and thymine (T). Each amino acid in a protein corresponds to a three-base-long sequence of DNA called a triplet codon. For example, a codon consisting of three thymines (TTT) corresponds to the amino acid phenylalanine, while TTA encodes leucine.
The genetic code is a dictionary of 64 such codons. Of these, 61 ‘sense’ codons together encode 20 common amino acids. The remaining three, called ‘stop’ codons, don’t correspond to any amino acid. Instead, when the protein-making mechanism encounters them, it terminates the protein chain.
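For readers who like to check the arithmetic, the dictionary can be sketched in a few lines of Python. This is only an illustration: the names BASES, AMINO_ACIDS and STANDARD_CODE are chosen here and do not come from the study. The one-letter amino acid symbols follow the usual convention, with ‘*’ marking a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional T, C, A, G ordering.
# Each letter is a one-letter amino acid symbol; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map every three-base codon (TTT, TTC, TTA, ...) to its amino acid.
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in STANDARD_CODE.items() if aa == "*")
sense_codons = [c for c, aa in STANDARD_CODE.items() if aa != "*"]
amino_acids = {STANDARD_CODE[c] for c in sense_codons}

print(len(STANDARD_CODE), len(sense_codons), len(amino_acids))  # 64 61 20
print(stop_codons)                                              # ['TAA', 'TAG', 'TGA']
print(STANDARD_CODE["TTT"], STANDARD_CODE["TTA"])               # F L
```

Here F stands for phenylalanine and L for leucine, matching the two examples above.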
Exceptional archaea
This code is for the most part common to all living organisms. Exceptions are rare. Some notable ones include the bacterium Mycoplasma, where the stop codon TGA encodes the amino acid tryptophan. In human beings, the same stop codon encodes the rare amino acid selenocysteine, used in a small number of proteins. In a few proteins in some archaea, the stop codon TAG doubles up as a code for another rare amino acid, pyrrolysine (Pyl).
Even in archaea where TAG is known to sometimes encode Pyl, scientists had until recently believed these organisms usually “use TAG as a stop codon, except in the very few enzymes in which Pyl occurs,” the authors of the Science study wrote in their paper.
That is now set to change. In their study, the authors reported certain archaea in which the TAG codon has been completely repurposed. These organisms read TAG as a signal for Pyl not occasionally but always: every time a TAG codon appears in a protein-coding gene, they incorporate Pyl into the protein chain.
This “genome-wide incorporation of Pyl at TAG codons” has led the team to propose “the existence of a previously unrecognised genetic code,” the authors wrote. Dubbed the ‘Pyl code’ by the team, it has 62 sense codons instead of the usual 61. And they code for 21 amino acids instead of the conventional 20.
Predicting proteins
The authors used computational methods to identify nine kinds of archaea where the TAG codon appeared to have been completely repurposed to encode Pyl. From these, they chose two archaea for experiments. One of these was Methanococcoides burtonii, which grows in the extremely low temperatures of Antarctic lakes. The other was Methanomethylophilus alvi, found in the human gut.
From these archaea, the researchers extracted proteins, fragmented them, and used a technique called mass spectrometry to identify the constituent amino acids. They found 54 proteins “not previously shown to contain Pyl,” they wrote. The proteins they identified play diverse roles in these organisms, including replicating DNA and producing energy. This led the authors to conclude that “M. burtonii and M. alvi archaea have adopted a non-standard genetic code with 62 sense codons encoding 21 amino acids and only two stop codons”.
The finding might require scientists to rethink how they predict protein sequences for these organisms. Typically, predicting the protein encoded by a gene requires researchers to read codons using the standard genetic code. For these archaea, however, scientists must use the Pyl code, “interpreting all TAG codons as coding for Pyl for correct protein prediction,” the authors have argued.
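The difference the choice of code makes can be seen in a toy example. The Python sketch below is not the authors’ prediction pipeline: the translate function, the PYL_CODE table and the 18-base ‘gene’ are all made up for illustration. It assumes the Pyl code differs from the standard code only at TAG, and uses ‘O’, the standard one-letter symbol for pyrrolysine.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in T, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumed 'Pyl code': the standard code with TAG reassigned to pyrrolysine (O).
PYL_CODE = {**STANDARD_CODE, "TAG": "O"}


def translate(dna, code):
    """Read the DNA three bases at a time and halt at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = code[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented toy gene: ATG TTT TAG TTA GGT TAA
toy_gene = "ATGTTTTAGTTAGGTTAA"

print(translate(toy_gene, STANDARD_CODE))  # MF    (chain ends at TAG)
print(translate(toy_gene, PYL_CODE))       # MFOLG (TAG read as Pyl)
```

Read with the standard code, the toy gene yields a two-residue fragment. Read with the Pyl code, the same DNA yields a five-residue chain with pyrrolysine in the middle, ending only at the TAA stop codon.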
Bacteria as factories
The study’s potential applications involve bioengineering, where researchers can manipulate bacteria to produce useful materials. The study’s findings could help researchers “incorporate Pyl in proteins at desired positions,” Tanweer Hussain, associate professor of developmental biology and genetics at the Indian Institute of Science, Bengaluru, said. Dr. Hussain studies how organisms build protein from their DNA blueprints.
His enthusiasm may be well founded. In their study, the Berkeley researchers genetically modified Escherichia coli, a common bacterium, to express the archaeal cellular machinery required to read the Pyl code and incorporate Pyl in proteins. They also engineered the bacterium to express a protein whose sequence had a TAG codon in the middle. If this setup worked, the bacteria would read this TAG codon as Pyl-coding, add Pyl at that location, and produce the complete protein. Otherwise, the TAG codon would signal ‘stop’, and the bacteria would produce a shorter protein.
Extracts from these bacteria confirmed that they had produced the complete protein. That is: they could indeed use the archaeal machinery to produce Pyl-containing proteins.
The discovery of the Pyl code has scientists excited. Both Dr. Hussain and Dr. Ghosh are keen to know more about the role of Pyl in proteins.
“Does Pyl incorporation confer the archaea a fitness advantage in their natural environments?” Dr. Ghosh asked.
He anticipated that future research could soon offer an answer.
Sayantan Datta is a faculty member at Krea University and an independent science journalist.
