Researchers have discovered a new “hidden” gene in the novel coronavirus that may contribute to the virus's unique biology and pandemic potential, an advance that may lead to the development of new therapeutics against the deadly virus. According to the scientists, including those from the American Museum of Natural History in the US, knowing more about the 15 genes that make up the coronavirus genome could have a significant impact on developing drugs and vaccines to combat the virus.
In the current study, published in the journal eLife, the researchers described overlapping genes — or “genes within genes” — in the virus which they believe play a role in the replication of the virus within host cells. “Overlapping genes may be one of an arsenal of ways in which coronaviruses have evolved to replicate efficiently, thwart host immunity, or get themselves transmitted,” said study lead author Chase Nelson from the American Museum of Natural History.
“Knowing that overlapping genes exist and how they function may reveal new avenues for coronavirus control, for example through antiviral drugs,” Nelson added.
The research team identified a new overlapping gene — ORF3d — in the novel coronavirus SARS-CoV-2 that has the potential to encode a protein that is longer than expected. They said ORF3d is also present in a previously discovered pangolin coronavirus, indicating the gene may have undergone changes during the evolution of SARS-CoV-2 and related viruses. According to the study, ORF3d has been independently identified and shown to elicit a strong antibody response in Covid-19 patients, demonstrating that the protein produced from the new gene is manufactured during human infection.
“We don’t yet know its function or if there’s clinical significance. But we predict this gene is relatively unlikely to be detected by a T-cell response, in contrast to the antibody response. And maybe that has something to do with how the gene was able to arise,” Nelson said.
The scientists explained that genes in coronaviruses resemble written language in that they are made of strings of chemical bases: adenine, guanine, uracil and cytosine, represented by the letters A, G, U and C respectively. These letters act as an information code for the synthesis of proteins within cells. But while the units of language (words) are discrete and non-overlapping, the researchers said genes can be overlapping and multi-functional, with information cryptically encoded depending on where you start “reading.” Overlapping genes are hard to spot, and most scientific computer programs are not designed to find them, but the scientists said they are common in viruses. This is partly because RNA viruses have a high mutation rate, so they tend to keep their gene count low, which limits the number of mutations a genome can accumulate, they explained.
The researchers noted that viruses have evolved a “sort of data compression system” in which a single letter in the genome can contribute to two or even three different genes.
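The idea that the same letters can be “read” in more than one way can be illustrated with a short Python sketch. This is only a toy illustration and not the researchers' method: the RNA string and the function names `translate` and `reading_frames` are made up for this example, and the only real-world ingredient is the standard genetic code, which maps each three-letter group (codon) to an amino acid or a stop signal.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# "*" marks a stop signal.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna, offset):
    """Read the RNA three letters at a time, starting at the given offset."""
    codons = [rna[i:i + 3] for i in range(offset, len(rna) - 2, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


def reading_frames(rna):
    """Translate the same letters from each of the three possible start points."""
    return {offset: translate(rna, offset) for offset in range(3)}


# Hypothetical toy sequence, not taken from any real viral genome.
toy_rna = "AUGGCUUGAAUGCCAUAA"
for offset, protein in reading_frames(toy_rna).items():
    print(f"start at letter {offset + 1}: {protein}")
```

Starting at the first letter, the toy sequence reads as `MA*MP*`; shifting the starting point by one or two letters gives `WLECH` and `GLNAI` instead. The letters are the same, but the message differs, which is roughly how a single stretch of a genome could carry more than one gene.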
“Missing overlapping genes puts us in peril of overlooking important aspects of viral biology,” said Nelson. “In terms of genome size, SARS-CoV-2 and its relatives are among the longest RNA viruses that exist. They are thus perhaps more prone to ‘genomic trickery’ than other RNA viruses,” he added.