University of California Scientists Discover Disturbing New SARS-CoV-2 Lineage Called B.1.x That Has Multiple Mutations Found in Other VOCs
Researchers from the Genomics Institute of the University of California-Santa Cruz have discovered a disturbing new SARS-CoV-2 lineage called B.1.x that carries multiple mutations found in other Variants of Concern (VOCs) but is not itself a recombinant strain. However, B.1.x also contains a large deletion in ORF8: a 35bp deletion (bases 27922-27956 of RefSeq NC_045512.2) that induces a frameshift and an early stop codon. Because of this deletion, automated sequence QC tools reject B.1.x submissions to databases. In submitting the sequences, the study team found that all eight genomes corresponding to the B.1.x lineage were initially rejected by both GISAID and GenBank.
The study team reports that the new SARS-CoV-2 lineage shares N501Y, P681H, and other mutations with known variants of concern, such as B.1.1.7.
The B.1.x lineage (COG-UK sometimes references similar samples as B.1.324.1) is present in at least 20 states across the USA and in at least six countries. However, its large deletion causes sequences to be automatically rejected from repositories, suggesting that the frequency of this new lineage is underestimated in public data.
Disturbingly, recent dynamics based on 339 samples obtained in Santa Cruz County, CA, USA suggest that B.1.x may be increasing in frequency at a rate similar to that of B.1.1.7 in Southern California. At present, the functional differences between B.1.x and other circulating SARS-CoV-2 variants are unknown, and further studies on secondary attack rates, viral loads, immune evasion and/or disease severity are needed to determine whether it poses a public health concern.
Nonetheless, given what is known from well-studied circulating variants of concern, it seems unlikely that the lineage could pose larger concerns for human health than many already globally distributed lineages. The study highlights a need for rapid turnaround time from sequence generation to submission and improved sequence quality control that removes submission bias.
The study findings were published on a preprint server and are currently being peer reviewed. https://www.biorxiv.org/content/10.1101/2021.04.05.438352v1
The new B.1.x lineage shares several mutations with B.1.1.7 and other known VOCs. The eight samples in the study share six non-synonymous mutations within the S or Spike protein relative to the Wuhan-Hu-1 SARS-CoV-2 reference sequence (RefSeq NC_045512.2).
More specifically, each sample contains Spike mutations S494P, N501Y, D614G, P681H, K854N, and E1111K.
Among these, three mutations are suspected to have an effect on the viral fitness and transmissibility. Specifically, N501Y is thought to be important for viral replication because it enables the virus to bind ACE2 and enter host cells more efficiently.
The S494P mutation is also located within the ACE2 receptor binding domain, and experimental evidence suggests that mutations at this position decrease antibody binding affinity. Similarly, P681H is located within the spike protein furin cleavage site, which is thought to be a hotspot of viral adaptive evolution. D614G became globally dominant in 2020, possibly due to higher viral loads. The effect of the other two Spike mutations is unknown.
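For readers who want to locate these Spike mutations on the genome, the short sketch below parses each label and computes the bases covered by the affected codon. It is an illustration only, not part of the study. The S gene start coordinate (base 21563 of RefSeq NC_045512.2) and the helper names `parse_mutation` and `codon_range` are assumptions introduced here.

```python
import re

# Assumption: the S gene starts at base 21563 of RefSeq NC_045512.2 (1-based).
SPIKE_START = 21563

# The six Spike mutations reported for the B.1.x samples.
SPIKE_MUTATIONS = ["S494P", "N501Y", "D614G", "P681H", "K854N", "E1111K"]

def parse_mutation(label):
    """Split a label such as 'N501Y' into (reference residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt

def codon_range(position, gene_start=SPIKE_START):
    """Return the 1-based, inclusive genomic coordinates of a codon."""
    first = gene_start + (position - 1) * 3
    return first, first + 2

for label in SPIKE_MUTATIONS:
    ref, pos, alt = parse_mutation(label)
    start, end = codon_range(pos)
    print(f"S:{label}: {ref}->{alt} at residue {pos}, codon bases {start}-{end}")
```

Under the stated assumption, residue 501 maps to bases 23063-23065, which is where the nucleotide change behind N501Y would be expected.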
In addition to the Spike mutations, B.1.x includes N:M234I (G28975A), which also appears in Variants of Interest B.1.526 and P.2 (G28975T). The three nucleotide mutations that cause N:M234I (G28975A, G28975C, G28975T) have all been observed at the roots of several Pango lineages, and the frequency of N:M234I in 480,704 samples available from GISAID as of 2 April 2021 with collection dates 2021-01-01 to 2021-03-31 is 7.0%. N:M234I has been predicted to be stabilizing for the protein structure.
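Why do three different nucleotide changes at base 28975 all produce the same amino acid change? The sketch below works through it, assuming the N gene starts at base 28274 of RefSeq NC_045512.2 (an assumption introduced here, as is the helper name `locate`). Methionine is encoded only by ATG, so the reference codon is known without consulting the genome.

```python
# Assumption: the N gene starts at base 28274 of RefSeq NC_045512.2 (1-based).
N_START = 28274

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def locate(genome_pos, gene_start):
    """Return (1-based codon number, 0-based position within the codon)."""
    offset = genome_pos - gene_start
    return offset // 3 + 1, offset % 3

codon_number, codon_offset = locate(28975, N_START)

# Methionine has a single codon, ATG, so the reference codon is known.
ref_codon = "ATG"
for new_base in "ACT":
    mutant = ref_codon[:codon_offset] + new_base + ref_codon[codon_offset + 1:]
    print(f"G28975{new_base}: {ref_codon}->{mutant} gives "
          f"N:{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[mutant]}")
```

Base 28975 falls on the third position of codon 234, and ATA, ATC and ATT all encode isoleucine, which is why each of the three substitutions yields N:M234I.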
More generally, because each of these mutations appears to have arisen in B.1.x independently of other VOCs, these substitutions reveal significant evolutionary parallelism between B.1.x and known VOCs.
The B.1.x lineage contains a large deletion in ORF8: a 35bp deletion (bases 27922-27956 of RefSeq NC_045512.2) that induces a frameshift and an early stop codon in ORF8. This is reminiscent of B.1.1.7, which also contains a nonsense mutation that causes a premature stop codon in ORF8.
To date the functional significance of this mutation is not known. This parallel evolution with the B.1.1.7 lineage suggests that inactivation of ORF8 may be favorable to the virus, possibly in combination with shared amino acid substitutions within the Spike protein.
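The frameshift follows from simple arithmetic: a deletion shifts the reading frame whenever its length is not a multiple of three. The sketch below checks this for the reported coordinates. The ORF8 boundaries (bases 27894-28259 of RefSeq NC_045512.2) and the function name `describe_deletion` are assumptions introduced for illustration.

```python
# Assumption: ORF8 spans bases 27894-28259 of RefSeq NC_045512.2 (1-based, inclusive).
ORF8_START, ORF8_END = 27894, 28259

def describe_deletion(del_start, del_end, orf_start=ORF8_START, orf_end=ORF8_END):
    """Summarize how a deletion (1-based, inclusive) affects an open reading frame."""
    length = del_end - del_start + 1
    inside = orf_start <= del_start and del_end <= orf_end
    return {
        "length": length,
        "inside_orf": inside,
        # A deletion shifts the reading frame unless its length is a multiple of 3.
        "frameshift": inside and length % 3 != 0,
        # First codon (1-based) touched by the deletion, if it lies within the ORF.
        "first_codon_hit": (del_start - orf_start) // 3 + 1 if inside else None,
    }

print(describe_deletion(27922, 27956))
```

For bases 27922-27956 the length is 35, which leaves a remainder of 2 when divided by 3, so every codon downstream of the deletion is read out of frame.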
More concerning is the fact that automated sequence QC tools rejected B.1.x submissions to databases. In submitting these sequences, the study team found that all eight genomes corresponding to the B.1.x lineage were initially rejected by both GISAID and GenBank due to the frameshift-inducing deletion of 35 bases in ORF8.
Getting them accepted required a few iterations of manual curation and sequence confirmation. An informal poll of other groups that recently submitted sequences to these databases indicated that many are too busy to engage in this process and are setting aside genomes that do not initially pass automated sequence checks!
Alarmingly, this means it is possible that sequences belonging to this lineage are underreported in GISAID and GenBank. Although such automated checks are clearly essential for maintaining high quality sequence databases, they may also result in a significant bias in the SARS-CoV-2 variants that are present in the databases.
The study team suggests that rapid phylogenetic placement during sequence submission, using e.g. UShER, might allow closely related sequences with novel variation to corroborate each other during batch submissions and across independent sequences, thus alleviating some of this submission burden. It is critical that these issues with the GISAID and GenBank submission systems be addressed, as they could hamper our ability to recognize and evaluate new variants of concern as they arise.
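The idea of sequences corroborating each other can be illustrated with a toy triage rule. The sketch below does not use UShER or any phylogenetic placement, and it does not reflect how GISAID or GenBank actually process submissions. It simply counts how many samples in a batch raise the same QC flag and treats widely shared flags as more likely to be real variation than sequencing error. The function name `triage_batch`, the flag strings, the sample IDs and the `min_support` threshold are all hypothetical.

```python
from collections import Counter

def triage_batch(batch, min_support=3):
    """Split samples into 'passed', 'corroborated' and 'needs_review'.

    batch: dict mapping a sample ID to the set of QC flags raised for it.
    A flagged sample is corroborated when every one of its flags is shared
    by at least `min_support` samples in the same batch.
    """
    support = Counter(flag for flags in batch.values() for flag in flags)
    result = {"passed": [], "corroborated": [], "needs_review": []}
    for sample, flags in batch.items():
        if not flags:
            result["passed"].append(sample)
        elif all(support[flag] >= min_support for flag in flags):
            result["corroborated"].append(sample)
        else:
            result["needs_review"].append(sample)
    return result

# Hypothetical batch: three samples share the ORF8 deletion, one has a lone flag.
batch = {
    "sample_1": {"del:27922-27956"},
    "sample_2": {"del:27922-27956"},
    "sample_3": {"del:27922-27956"},
    "sample_4": {"del:21991-21992"},
    "sample_5": set(),
}
print(triage_batch(batch))
```

In this toy batch the three samples carrying the same ORF8 deletion support one another, while the sample with a unique flag would still go to manual review. A real system would need to weigh sequencing batch effects, which can produce shared artifacts.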
For the latest on new emerging SARS-CoV-2 variants, keep logging on to Thailand Medical News.