COVID-19 Genetics: Human Host Mediated Virus RNA Editing By Deaminases Is The Reason Numerous SARS-CoV-2 Variants Are Emerging
Source: COVID-19 Genetics Dec 14, 2020
COVID-19 Genetics: A new study led by researchers from the Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi, India, describes the prevalence of single-nucleotide variations within a single host. It suggests that this phenomenon is much more common in the Indian population and that it may lead to the emergence of many variants in the population. Importantly, the variants may have different spike protein characteristics, leading to potential antibody resistance.
In their study abstract, the research team says that since its zoonotic transmission in the human host, the SARS-CoV-2 virus has infected millions and has diversified extensively.
A key feature of viral survival is continuous evolution and adaptation within the host. RNA editing via the APOBEC and ADAR families of enzymes has recently been implicated as the major driver of intra-host variability of the SARS-CoV-2 genomes. Analysis of the intra-host single-nucleotide variations (iSNVs) in SARS-CoV-2 genomes at spatio-temporal scales can provide insights on the consequence of RNA editing on the establishment, spread and functional outcomes of the virus.
In this study, utilizing 1,347 transcriptomes of COVID-19 infected patients across various populations, the study team found variable prevalence of iSNVs, with distinctly higher levels in the Indian population.
The study findings also suggest that iSNVs can likely establish variants in a population. These iSNVs may also contribute to key structural and functional changes in the Spike protein that confer antibody resistance.
The study findings were published on a preprint server and are currently being peer reviewed.
https://www.biorxiv.org/content/10.1101/2020.12.09.417519v1
From the beginning of the COVID-19 pandemic, the causative pathogen, i.e. the SARS-CoV-2 coronavirus, has undergone several mutations. This has led to significant variability of the virus within a single host.
The SARS-CoV-2 coronavirus is an RNA virus, and as such was expected to show a high rate of mutation in its genome.
Typically, single-nucleotide variants (SNVs) lead to the formation of multiple quasispecies, with very similar but not identical genotypes.
Numerous mechanisms underlie this high mutational tendency, including host-mediated RNA editing by deaminases. This has been suggested to be a major cause of intra-host variability for this virus, as well.
The two important families of RNA editing enzymes are the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC) enzymes and Adenosine Deaminase RNA Specific 1 (ADAR1). They are known to be activated in innate antiviral immunity against many viruses, including the coronavirus family.
https://pubmed.ncbi.nlm.nih.gov/32596474/
https://pubmed.ncbi.nlm.nih.gov/29654310/
While the APOBEC enzymes deaminate cytosine to uracil on single-stranded RNA, causing a C-T transition, ADARs deaminate adenines to inosines on double-stranded RNA. Since inosine forms a base pair with cytosine, subsequent replication places guanine where the adenine was, leading to an A-G transition.
However, should these enzymes act on the negative strand of SARS-CoV-2, the resulting changes will be G-A transitions for APOBEC and T-C transitions for ADAR1.
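The four editing signatures described above can be written down as a small lookup table. The following is a minimal sketch, not the study team's pipeline: the names `EDIT_SIGNATURES`, `classify_substitution` and `as_seen_on_positive_strand` are hypothetical, and attributing an enzyme from the substitution type alone is a simplifying assumption, since the same change can also arise from replication errors.

```python
# Hypothetical helper names; the mapping itself follows the text above.
# Substitutions are written as they appear on the positive (genomic) strand,
# using T in place of U, as the article does.
EDIT_SIGNATURES = {
    ("C", "T"): ("APOBEC", "positive"),  # C-to-U deamination on the + strand
    ("G", "A"): ("APOBEC", "negative"),  # C-to-U on the - strand, read on +
    ("A", "G"): ("ADAR", "positive"),    # A-to-I deamination, I read as G
    ("T", "C"): ("ADAR", "negative"),    # A-to-I on the - strand, read on +
}

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def classify_substitution(ref, alt):
    """Return (enzyme family, strand edited) for a positive-strand change.

    Returns None for changes that match neither deaminase signature.
    """
    return EDIT_SIGNATURES.get((ref.upper(), alt.upper()))


def as_seen_on_positive_strand(ref, alt):
    """Translate a change made on the negative strand into + strand notation."""
    return COMPLEMENT[ref.upper()], COMPLEMENT[alt.upper()]


# A C-to-T edit made on the negative strand shows up as G-to-A on the genome.
negative_strand_apobec = as_seen_on_positive_strand("C", "T")
```

For example, `classify_substitution("A", "G")` yields the ADAR signature on the positive strand, while a transversion such as A-to-C matches no signature and returns `None`.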
Significantly, in both cases, these small changes can have far-reaching effects on the secondary structure of RNA, its regulatory regions, protein structure and function, and the interactions between the virus and its host.
Interestingly, of the many thousands of mutations that are possible, very few result in enhanced viral fitness, for example by conferring immune evasion or drug resistance.
Also importantly, since the viral population within a host consists of all intra-host quasispecies together, the fitness of the virus within this host includes the contribution of all the haplotypes.
This current study was aimed at exploring intra-host single-nucleotide variations (iSNVs) to understand which points within the genome contribute to intra-host viral fitness in this manner.
The study team used transcriptome data, covering the whole of the transcribed RNA, from over 1,300 viral isolates. These came from India (from multiple subgroups), China, Malaysia, Germany, the UK, and the USA. More than 86,000 iSNVs were detected, at a median of 19 per sample, and the calls appear to be reliable.
The team found widespread evidence of RNA editing by both enzymes, with ADAR1 activity appearing excessive in some cases. A-G and T-C substitutions, caused by ADAR1, made up about 36% of all variant positions, but in relatively few samples, especially the former.
Both types of editing resulted in synonymous and missense mutations to the same extent, but C-T or G-A changes also caused stop-gain mutations. The latter can change the amount of functional protein product synthesized from the viral genome. Overall, the iSNVs brought about amino acid changes at almost equal frequencies, and many of them were non-synonymous or stop-gain mutations.
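The observation that only C-T and G-A changes produce stop-gain mutations follows from the standard genetic code: the three stop codons (TAA, TAG, TGA) contain no C, and any G they contain can only come from an A in a codon that was already a stop. The short sketch below enumerates this; the function and variable names are illustrative and not taken from the study.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, T written for U

# The four deaminase-associated transitions discussed in the article.
EDITS = {
    "C>T": ("C", "T"),
    "G>A": ("G", "A"),
    "A>G": ("A", "G"),
    "T>C": ("T", "C"),
}


def stop_gain_codons(ref, alt):
    """List (sense codon, stop codon) pairs reachable by one ref->alt change."""
    hits = set()
    for codon in map("".join, product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue  # stop-to-stop changes are not stop gains
        for i, base in enumerate(codon):
            if base == ref:
                mutated = codon[:i] + alt + codon[i + 1:]
                if mutated in STOP_CODONS:
                    hits.add((codon, mutated))
    return sorted(hits)


stop_gains = {name: stop_gain_codons(*change) for name, change in EDITS.items()}
for name, pairs in stop_gains.items():
    print(name, pairs)
```

Under these definitions, C-T edits can convert CAA, CAG and CGA into stop codons, G-A edits can convert TGG, and the A-G and T-C edits cannot create a stop codon from any sense codon.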
The incidence of iSNVs appears to be non-uniform across the populations studied. If so, these changes might reflect current editing activity within the groups studied.
Notably, Indian isolates showed a significantly greater number of iSNVs across all subpopulations than European, Chinese, or US samples, despite comparable sample numbers. This difference could be due to genetic variation within populations, shaped partly by the positive selection pressure of a heavy viral load.
Figure: (A) Split plot depicting the distribution of iSNV sites with respect to nucleotide changes in the SARS-CoV-2 genome (n=23516) and across samples. (B) Potential impact of iSNVs vis-a-vis nature of amino acid sequence change in the SARS-CoV-2 genome (n=6251). (C) Distribution of iSNVs in samples of different population cohorts. The significance of the pairwise two-sided t-tests is indicated on top (*p < 10^-6, **p < 10^-16, ns = non-significant). (D) Radial plot with concentric rings representing the extent of position-wise heterogeneity in samples in the global populations. iSNV frequency distribution in samples is shown for select heteroplasmic positions in frequency bins of 0.2. The colour gradient in each cell represents the percentage of samples. The outer labels denote the nucleotide change and amino acid change (italicized) with the position of change. Variations that are represented in the A2a and A3i/A4 clades have been marked (*) and (**) respectively.
Importantly, malaria-endemic regions in India show an insertion allele for APOBEC3B which reduces the incidence of severe malaria, and these regions also have negligible COVID-19 mortality. https://osf.io/6sfw8
Although the correlation of the APOBEC insertion with protection still needs to be tested, this nevertheless suggests a role for this family of enzymes in the evolutionary outcomes of SARS-CoV-2 infection and the burden of disease.
The study team also found the sites where iSNVs seem to lock into genomic variations that persist as viral variants over time. Most of these genetic loci seem to be common to sets of viral isolates from anywhere in the world. This could mean they are preferred sites for these RNA editing enzymes.
The team highlights that the A2a clade, which is now the most common isolate, and the A3i/A4 clade, which was unique to India and has now almost been replaced, both show variations in the frequencies of nucleotides at their defining positions. C-T or A-G mutations are seen in all defining variants of clade A2a, such as D614G.
It was also found that some of the samples of RNA seemed to be subject to a very high editing frequency, as shown by numerous allele variations at multiple sites. Analysis of these samples showed that in about a third, the change involved A-G, caused by high ADAR activity.
Additionally, the spike iSNVs seem to affect functionally important residues, causing almost 1,500 protein-coding variants.
The most commonly altered site was D614G, followed by Y91, I105, and D428. There were observable hyper-variable sites within the amino acid sequence of the spike protein, which seemed to give rise to multiple variants.
The receptor-binding motif (RBM) appeared to be hyper-variable. For instance, 11 of 25 hyper-edited samples showed changes in specific residues at three particular positions. These could lead to antigenic changes, adaptations in biological function, and mutational escape from antibody neutralization.
Importantly, such changes have been observed in the case of an immunocompromised individual in whom the virus lingered for months, undergoing numerous mutations to produce a host of non-synonymous changes. These heavily favored the spike protein and the receptor-binding domain (RBD), which accounted for 57% and 38% of the changes though they make up only 13% and 2% of the genome, respectively.
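As a rough check on the figures quoted above, the over-representation of a region can be expressed as its share of the changes divided by its share of the genome. This is a back-of-the-envelope sketch using only the percentages given in the article; the function name `fold_enrichment` is illustrative and this is not a statistical test from the study.

```python
def fold_enrichment(share_of_changes, share_of_genome):
    """Ratio of observed share of changes to the share expected by length."""
    return share_of_changes / share_of_genome


# Percentages as quoted in the article for the immunocompromised case.
spike_enrichment = fold_enrichment(0.57, 0.13)  # spike protein
rbd_enrichment = fold_enrichment(0.38, 0.02)    # receptor-binding domain

print(f"Spike: {spike_enrichment:.1f}-fold, RBD: {rbd_enrichment:.1f}-fold")
```

On these numbers, changes were roughly 4.4 times more frequent in the spike gene and about 19 times more frequent in the RBD than their lengths alone would predict.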
The study team said, “These observations substantiate that editing within hosts may lead to an evolved immune escape ability in some strains which may seem to be a case of reinfection in a host after weeks or months of the first incidence.”
Also notable were other sites of alteration that have been related to antibody resistance and immune escape, such as Q493K and N493K, respectively.
The study findings underline the existence of extensive cross-talk between the virus and the host cell. The effects of one such interaction, mediated by RNA editors, include rapid and functionally important changes in the SARS-CoV-2 genome.
The research also highlights the need for capturing iSNVs to enable more accurate models for molecular epidemiology as well as for diagnostics and vaccine design.
The study team said, “In conclusion, temporally tracking within-host variability of the virus in individuals and populations might provide important leads to the sites that are favourable or deleterious for virus survival. This information would be of enormous utility for diagnostics, design of vaccines as well as predicting the spread and infectivity of viral strains in the population. Conjoint analysis with the host variability in editing machinery should be the next step.”
For more on COVID-19 Genetics, keep on logging to Thailand Medical News.