COVID-19 Terminology Part 1: What Is A Clade And What Are The SARS-CoV-2 Clades?
We at Thailand Medical News found that many of our readers had problems understanding terminology that is often used in various articles, including terms like mutations, isolates, variants, strains, antigens and antibodies, and we have decided to start a new series to explain this terminology to readers. As COVID-19 is expected to be with us for the next 10 years or so (forget about any vaccines: at best they can offer short-term protection, at a long-term cost, and they will not be able to eradicate the COVID-19 pandemic), it will be useful to know and understand the terminology that is used so often.
The SARS-CoV-2 coronavirus began to mutate quickly after its debut in December 2019 in Wuhan, Hubei province, China. Mutations are common in coronaviruses, and SARS-CoV-2 has been found to have many different clades.
Importantly, these clades can give scientists information on where certain strains of the virus are concentrated, and how these different clades may impact the virulence of SARS-CoV-2, the speed of the disease spread, and its resistance to antiviral medications.
Basically a clade is a term for a group of organisms that all originate from a common ancestor, and is widely used in biology. Using phylogeny, which is the evolutionary history of a group of organisms, the development of changes in a set of descendant organisms can be tracked.
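To make the definition concrete, here is a minimal Python sketch that collects a clade, meaning an ancestor together with all of its descendants, from a small tree. The tree, its node names, and the `clade` function are hypothetical and only illustrate the idea; they do not represent real SARS-CoV-2 lineages.

```python
# Toy phylogeny stored as parent -> list of children.
# All node names are made up for illustration.
tree = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1"],
    "A1": [],
    "A2": [],
    "B1": [],
}


def clade(tree: dict[str, list[str]], ancestor: str) -> list[str]:
    """Return the ancestor and every descendant below it (depth-first)."""
    members = [ancestor]
    for child in tree.get(ancestor, []):
        members.extend(clade(tree, child))
    return members


print(clade(tree, "A"))  # ['A', 'A1', 'A2']
```

Everything returned for "A" shares "A" as a common ancestor, which is what makes the group a clade; "B1" is excluded because it descends from a different branch.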
In the field of virology, a clade describes groups of similar viruses based on their genetic sequences, and changes in those viruses can also be tracked using phylogeny. Rapid genome sequencing is the method by which developments in a virus’s genomic makeup can be tracked.
The SARS-CoV-2 coronavirus is itself a clade within the virus family Coronaviridae and the genus Betacoronavirus. Generally, the genetic variations of a virus are grouped into clades, which can also be called subtypes, genotypes, or groups.
The process of rapid genome sequencing can help to quickly work out where an individual has become infected with a certain clade of SARS-CoV-2. For instance, the first four cases of COVID-19 in New South Wales, Australia, were found to be closely related to the dominant strain of SARS-CoV-2 found in Wuhan, and these first four cases were all in people who had recently returned from traveling in China. This meant that travel could then be restricted between China and Australia to limit the numbers of infected people traveling to and from the two countries.
A study published in the journal Virus Evolution in April 2020 mentioned that cases were also tracked from Australia to Iran, where it was found that the genomes were all from one monophyletic group characterized by three nucleotide substitutions in the SARS-CoV-2 genome when compared to the prototype strain from Wuhan. https://academic.oup.com/ve/article/6/1/veaa027/5818738
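As a rough illustration of what "nucleotide substitutions compared to the prototype strain" means, the following Python sketch lists the positions at which a sample sequence differs from a reference. The two sequences are short made-up strings, not real SARS-CoV-2 data, and the function name `list_substitutions` is hypothetical. The sketch also assumes the sequences are already aligned and of equal length, which real genome comparisons have to establish first.

```python
def list_substitutions(reference: str, sample: str) -> list[str]:
    """List substitutions as strings such as '3G>T' (1-based positions).

    Assumes both sequences are already aligned and have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if ref_base != alt_base:
            substitutions.append(f"{position}{ref_base}>{alt_base}")
    return substitutions


# Toy sequences for illustration only.
reference = "ATGCCGTAAC"
sample = "ATTCCGTGAT"

print(list_substitutions(reference, sample))  # ['3G>T', '8A>G', '10C>T']
```

Samples that share the same set of substitutions relative to the reference are candidates for belonging to the same group, although real phylogenetic analysis uses far more than a simple position-by-position comparison.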
As of November 14, 2020, there were 201,306 complete and high-coverage genomes available through the Global Initiative on Sharing All Influenza Data (GISAID). https://www.gisaid.org/
A research paper published in the International Journal of Infectious Diseases (IJID) on 22 August 2020 found that there were five clades of SARS-CoV-2 that were characterized by 11 major mutations worldwide. https://www.sciencedirect.com/science/article/pii/S1201971220306810
There was an increased dominance of one or two clades in each geographic location included in the study.
The five clades were: D392, G614, I378, S84 and V251.
Note that each of these clades can also carry a variety of additional mutations.
New clades and mutations that have not been included here also continue to be discovered.
Clade G614 was most widely spread in Europe and North America after being brought into those continents by people traveling from China, and its dominance could be due to the increased longevity of the virus that this particular mutation causes.
Importantly, the majority of the genomes that the study did not assign to a major clade were found in Asia and were detected early in the pandemic.
It is important to note, however, that this study was limited by the range of genomic data available, which came only from certain regions. As a result, new clades may become apparent in the future as more geographical regions make genomic data available.
A different study, published in the Bulletin of the World Health Organization (WHO), showed how the SARS-CoV-2 genome has evolved as it has spread across the world. https://www.who.int/bulletin/volumes/98/7/20-253591/en/
Strangely, this study did not show how these changes affected the virulence of the virus, but it did show that, among the six clades and 14 subclades it identified, the most common SARS-CoV-2 clade was the one defined by the D614G variant.
The research included 10,022 SARS-CoV-2 genomes from 68 different countries. In total, the study detected 65,776 variants, of which 5,775 were distinct. The distinct variants comprised:
2,969 missense mutations
1,965 synonymous mutations
484 mutations in non-coding regions
142 non-coding deletions
100 in-frame deletions
66 non-coding insertions
36 stop-gained variants
11 frameshift deletions
2 in-frame insertions
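As a quick arithmetic check, the category counts listed above can be added up to confirm that they match the stated figure of 5,775 distinct variants. The short Python sketch below does this; the dictionary name `variant_counts` is simply a label chosen for this example.

```python
# Counts of distinct variants per category, as listed above.
variant_counts = {
    "missense mutations": 2969,
    "synonymous mutations": 1965,
    "mutations in non-coding regions": 484,
    "non-coding deletions": 142,
    "in-frame deletions": 100,
    "non-coding insertions": 66,
    "stop-gained variants": 36,
    "frameshift deletions": 11,
    "in-frame insertions": 2,
}

total = sum(variant_counts.values())
print(total)  # 5775

# Share of each category among all distinct variants, largest first.
for category, count in sorted(variant_counts.items(), key=lambda item: -item[1]):
    print(f"{category}: {count / total:.1%}")
```

The total agrees with the 5,775 distinct variants reported, and missense mutations alone account for just over half of them.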
Importantly, the D614G variant is located in a B-cell epitope within a highly immunodominant region, which may affect how well a vaccine works against it.
The largest clade found in the WHO study was D614G, which has five subclades associated with it. The non-coding variant 241C > T, the variant 3037C > T, and ORF1ab P4715L were found in most of the samples in the D614G clade.
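Labels such as D614G, P4715L and 241C > T follow two common shorthand patterns: amino acid changes are written as original residue, position, new residue, while nucleotide changes are written as position, original base, ">", new base. The Python sketch below splits such labels into their parts. The `Mutation` type and the `parse_mutation` function are hypothetical names for this example, and the patterns cover only the simple forms quoted in this article (not deletions, insertions or stop codons).

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    kind: str       # "amino_acid" or "nucleotide"
    position: int   # 1-based position in the protein or genome
    ref: str        # original residue or base
    alt: str        # new residue or base


# D614G style: one letter, a position, one letter.
AMINO_ACID_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")
# 241C > T style: a position, a base, '>', a base (spaces optional).
NUCLEOTIDE_PATTERN = re.compile(r"^(\d+)([ACGT])\s*>\s*([ACGT])$")


def parse_mutation(label: str) -> Mutation:
    """Parse a simple substitution label into its components."""
    label = label.strip()
    match = AMINO_ACID_PATTERN.match(label)
    if match:
        ref, position, alt = match.groups()
        return Mutation("amino_acid", int(position), ref, alt)
    match = NUCLEOTIDE_PATTERN.match(label)
    if match:
        position, ref, alt = match.groups()
        return Mutation("nucleotide", int(position), ref, alt)
    raise ValueError(f"unrecognized mutation label: {label!r}")


for label in ["D614G", "P4715L", "241C > T", "3037C > T"]:
    print(label, "->", parse_mutation(label))
```

Read this way, D614G means that the amino acid D (aspartic acid) at position 614 has been replaced by G (glycine), and 241C > T means that the base C at genome position 241 has been replaced by T.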
In addition, almost every strain that had a D614G mutation also featured mutations in proteins that control viral replication, which has implications for how quickly the virus can multiply. These proteins are what the antiviral drugs remdesivir and favipiravir target. It is possible that strains of SARS-CoV-2 could quickly become resistant to drugs that target these proteins.
Next, the second-largest clade identified by the WHO study was L84S, which comprises two subclades. L84S was a clade found in people traveling from Wuhan in the early phases of the SARS-CoV-2 outbreak.
Importantly, the August study in the International Journal of Infectious Diseases (IJID) states that while genome replication is occurring in the host, SARS-CoV-2 undergoes genome mutations that can be passed on to descendant genomes and new hosts as it spreads between people.
To date, the SARS-CoV-2 pandemic has affected 188 countries on every continent except Antarctica.
The A2a clade, which was introduced into New York through Europe and Italy, is concentrated on the East Coast of the USA. The B1 clade predominates on the West Coast of the USA. The G614 clade has become widespread globally, while in the early stages of the pandemic S84 was the predominant clade in Asia when unassigned genomes are excluded.
Data in the WHO investigation on the locations in which certain clades and subclades are common includes:
L84S/p5828L/ subclade in the USA
G251V in the UK, Australia, USA, and Iceland.
The SARS-CoV-2 coronavirus has proven to be a genetically diverse virus that has now become endemic within humans. Identifying and tracking clades that become dominant in certain geographical regions may inform the development of effective vaccines and treatments, as certain antiviral drugs may not work against mutated forms of the virus, or against clades of the virus that have developed resistance through genomic mutation.
To date, many new mutations and variants are emerging and only a few have been properly identified. Some belong to these existing clades and some do not. Some of these newly emerging strains are not only resistant to antibodies and drugs but also have newer modes of evading the human host immune system.
For more on COVID-19 terminology and background information, keep logging on to Thailand Medical News.