36) (RDP, GENECONV, MaxChi, Bootscan, SisScan and 3SEQ) and considered recombination signals detected by more than two methods for breakpoint identification. Among the 68 sequences in the aligned sarbecovirus sequence set, 67 show evidence of mosaicism (all Dunn–Sidak-corrected P < 4 × 10^-4 and 3SEQ [14]), indicating involvement in homologous recombination either directly with identifiable parentals or in their deeper shared evolutionary history; that is, due to shared ancestral recombination events.
The red and blue boxplots represent the divergence time estimates for SARS-CoV-2 (red) and the 2002-2003 SARS-CoV (blue) from their most closely related bat virus, with the light- and dark-colored versions based on the HCoV-OC43 and MERS-CoV centered priors, respectively.
We used TreeAnnotator to summarize posterior tree distributions and annotated the estimated values to a maximum clade credibility tree, which was visualized using FigTree. These datasets were subjected to the same recombination masking approach as NRA3 and were characterized by a strong temporal signal (Fig.
All sequence data analysed in this manuscript are available at https://github.com/plemey/SARSCoV2origins.
SARS-CoV-2 and RaTG13 are the most closely related (their most recent common ancestor nodes denoted by green circles), except in the 222-nt variable-loop region of the C-terminal domain (bar graphs at bottom).
A deep dive into the genetics of the novel coronavirus shows it seems to have spent some time infecting both bats and pangolins before it jumped into humans, researchers said.
One geographic clade includes viruses from provinces in southern China (Guangxi, Yunnan, Guizhou and Guangdong), with its major sister clade consisting of viruses from provinces in northern China (Shanxi, Henan, Hebei and Jilin) as well as Hubei Province in central China and Shaanxi Province in northwestern China.
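The consensus rule stated above (keep only recombination signals detected by more than two methods) can be illustrated with a minimal sketch. The input format, the event identifiers and the function name `consensus_events` are assumptions made for illustration; they are not the output format of any real recombination detection package.

```python
from collections import defaultdict

# Hypothetical input: (event_id, method_name) pairs, one per detection.
# The event IDs and this tuple layout are illustrative assumptions only.
def consensus_events(detections, min_methods=3):
    """Return event IDs detected by at least `min_methods` distinct methods.

    "More than two methods" corresponds to min_methods=3.
    """
    methods_by_event = defaultdict(set)
    for event_id, method in detections:
        methods_by_event[event_id].add(method)
    return sorted(
        event_id
        for event_id, methods in methods_by_event.items()
        if len(methods) >= min_methods
    )


example_detections = [
    ("event_1", "RDP"), ("event_1", "GENECONV"), ("event_1", "3SEQ"),
    ("event_2", "MaxChi"), ("event_2", "Bootscan"),
    ("event_3", "RDP"), ("event_3", "RDP"), ("event_3", "SisScan"),
]
supported = consensus_events(example_detections)
```

In this toy input only `event_1` is retained: `event_2` and `event_3` are each supported by just two distinct methods, and the repeated RDP detection of `event_3` is counted once.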
Researchers have found that SARS-CoV-2 in humans shares about 90.3% of its genome sequence with a coronavirus found in pangolins (Cyranoski, 2020).
Coronavirus origins: genome analysis suggests two viruses may have combined
Even before the COVID-19 pandemic, pangolins have been making headlines.
Meet the people who warn the world about new covid variants [12]
Current Overview on Disease and Health Research Vol. 6
Coronavirus: Pangolins may have spread the disease to humans
All four of these breakpoints were also identified with the tree-based recombination detection method GARD [35]. These residues are also in the Pangolin Guangdong 2019 sequence.
Coronavirus Disease 2019 (COVID-19) Situation Report 51 (World Health Organization, 2020).
SARS-CoV-2 Variant Classifications and Definitions
Why Can't We Just Call BA.2 Omicron? - The Atlantic
is funded by The National Natural Science Foundation of China Excellent Young Scientists Fund (Hong Kong and Macau; no.
Did Pangolin Trafficking Cause the Coronavirus Pandemic?
(2020) with additional (and higher quality) snake coding sequence data and several miscellaneous eukaryotes with low genomic GC content failed to find any meaningful clustering of SARS-CoV-2 with snake genomes (a). The consistency of the posterior rates for the different prior means also implies that the data do contribute to the evolutionary rate estimate, despite the fact that a temporal signal was not visually apparent (Extended Data Fig.
Coronavirus: Pangolins found to carry related strains - BBC News
Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.
Use of Genomics to Track Coronavirus Disease Outbreaks, New Zealand
The variable-loop region in SARS-CoV-2 shows closer identity to the 2019 pangolin coronavirus sequence than to the RaTG13 bat virus, supported by phylogenetic inference (Fig.
Nevertheless, the viral population is largely spatially structured according to provinces in the south and southeast on one lineage, and provinces in the centre, east and northeast on another (Fig.
It is RaTG13 that is more divergent in the variable-loop region (Extended Data Fig.
The PANGOLIN lineage database (15, 16) was used to analyze the frequency of lineages among countries.
When viewing the last 7 kb of the genome, a clade of viruses from northern China appears to cluster with sequences from southern Chinese provinces but, when inspecting trees from different parts of ORF1ab, the N. China clade is phylogenetically separated from the S. China clade.
The histogram allows for the identification of non-recombining regions (NRRs) by revealing regions with no breakpoints. This provides compelling support for the SARS-CoV-2 lineage being the consequence of a direct or nearly-direct zoonotic jump from bats, because the key ACE2-binding residues were present in viruses circulating in bats.
We compare both MERS-CoV- and HCoV-OC43-centred prior distributions (Extended Data Fig.
4 we compare these divergence time estimates to those obtained using the MERS-CoV-centred rate priors for NRR1, NRR2 and NRA3.
A pneumonia outbreak associated with a new coronavirus of probable bat origin.
These means are based on the mean rates estimated for MERS-CoV and HCoV-OC43, respectively, while the standard deviations are set ten times higher than empirical values to allow greater prior uncertainty and avoid strong bias (Extended Data Fig.
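The idea mentioned above, identifying non-recombining regions as stretches of the genome with no breakpoints, can be sketched as follows. This is a simplified illustration under stated assumptions: breakpoints are given as 1-based alignment positions, and the function name `breakpoint_free_regions`, the `min_length` parameter and the toy coordinates are hypothetical, not the procedure or values used in the study.

```python
def breakpoint_free_regions(breakpoints, genome_length, min_length=1):
    """Return 1-based inclusive (start, end) intervals that contain no breakpoint.

    Assumption: each breakpoint occupies a single alignment position, and a
    region is reported only if it spans at least `min_length` positions.
    """
    regions = []
    start = 1
    for bp in sorted(set(breakpoints)):
        length = bp - start  # number of positions in [start, bp - 1]
        if length >= max(min_length, 1):
            regions.append((start, bp - 1))
        start = bp + 1
    tail_length = genome_length - start + 1
    if tail_length >= max(min_length, 1):
        regions.append((start, genome_length))
    return regions


# Toy example on a 12-position alignment with breakpoints at positions 5 and 10.
toy_regions = breakpoint_free_regions([5, 10], genome_length=12)
toy_long_regions = breakpoint_free_regions([5, 10], genome_length=12, min_length=3)
```

Raising `min_length` mimics a more conservative choice, where only long breakpoint-free stretches are kept as candidate regions for phylogenetic analysis.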
Pangolin relies on a novel algorithm called pangoLEARN.
Center for Infectious Disease Dynamics, Department of Biology, Pennsylvania State University, University Park, PA, USA; Department of Microbiology, Immunology and Transplantation, KU Leuven, Rega Institute, Leuven, Belgium; Department of Biological Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, China; State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, China; Department of Biology, University of Texas Arlington, Arlington, TX, USA; Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK; MRC-University of Glasgow Centre for Virus Research, Glasgow, UK.
We named the length-sorted BFRs as: BFR A (nt positions 13,291–19,628, length = 6,338 nt), BFR B (nt positions 3,625–9,150, length = 5,526 nt), BFR C (nt positions 9,261–11,795, length = 2,535 nt), BFR D (nt positions 27,702–28,843, length = 1,142 nt) and six further regions (E–J).
This statement informs us of the possibility that a virus has spilled over from a very rare and shy reptile-looking mammal.
Specifically, using a formal Bayesian approach [42] (see Methods), we estimate a fast evolutionary rate (0.00169 substitutions per site per year, 95% highest posterior density (HPD) interval (0.00131, 0.00205)) for SARS viruses sampled over a limited timescale (1 year), a slower rate (0.00078 (0.00063, 0.00092) substitutions per site per year) for MERS-CoV on a timescale of about 4 years and the slowest rate (0.00024 (0.00019, 0.00029) substitutions per site per year) for HCoV-OC43 over almost five decades. By mid-January 2020, the virus was spreading widely within Hubei province and by early March SARS-CoV-2 was declared a pandemic [8].
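As a quick consistency check on the breakpoint-free regions (BFRs) listed above, the reported lengths follow from the reported coordinates when both end positions are counted (length = end - start + 1). The sketch below recomputes them; the coordinates and lengths are taken from the text, while the dictionary layout and the helper name `region_length` are illustrative assumptions.

```python
# Coordinates (start, end) and reported lengths as quoted in the text above.
bfr_coordinates = {
    "BFR A": (13291, 19628),
    "BFR B": (3625, 9150),
    "BFR C": (9261, 11795),
    "BFR D": (27702, 28843),
}
reported_lengths = {"BFR A": 6338, "BFR B": 5526, "BFR C": 2535, "BFR D": 1142}


def region_length(start, end):
    """Length of a 1-based inclusive interval of nucleotide positions."""
    return end - start + 1


computed_lengths = {
    name: region_length(start, end)
    for name, (start, end) in bfr_coordinates.items()
}

# Names sorted from longest to shortest, matching the "length-sorted" naming.
length_sorted_names = sorted(computed_lengths, key=computed_lengths.get, reverse=True)
```

The computed values agree with the reported ones, and sorting by length reproduces the A to D ordering.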
This dataset comprises an updated version of that used in Hon et al. [15] and includes a cluster of genomes sampled in late 2003 and early 2004, but the evolutionary rate estimate without this cluster (0.00175 substitutions per site per year (0.00117, 0.00229)) is consistent with the complete dataset (0.00169 substitutions per site per year (0.00131, 0.00205)). It is clear from our analysis that viruses closely related to SARS-CoV-2 have been circulating in horseshoe bats for many decades.
The command line tool is open source software available under the GNU General Public License v3.0.
Pink, green and orange bars show BFRs, with region A (nt 13,291–19,628) showing two trimmed segments yielding region A′ (nt 13,291–14,932, 15,405–17,162, 18,009–19,628).
We used an uncorrelated relaxed clock model with log-normal distribution for all datasets, except for the low-diversity SARS data for which we specified a strict molecular clock model. In the absence of any reasonable prior knowledge on the TMRCA of the sarbecovirus datasets (which is required for grid specification in a skygrid model), we specified a simpler constant size population prior.
The authors declare no competing interests.
Given what was known about the origins of SARS, as well as identification of SARS-like viruses circulating in bats that had binding sites adapted to human receptors [29, 30, 31], appropriate measures should have been in place for immediate control of outbreaks of novel coronaviruses.
Pangolin-CoV is 91.02% and 90.55% identical to SARS-CoV-2 and BatCoV RaTG13, respectively, at the whole-genome level.
A reduced sequence set of 25 sequences chosen to capture the breadth of diversity in the sarbecoviruses (obvious recombinants not involving the SARS-CoV-2 lineage were also excluded) was used because GARD is computationally intensive. On first examination this would suggest that SARS-CoV-2 is a recombinant of an ancestor of Pangolin-2019 and RaTG13, as proposed by others [11, 22]. The difficulty in inferring reliable evolutionary histories for coronaviruses is that their high recombination rate [48, 49] violates the assumption of standard phylogenetic approaches because different parts of the genome have different histories. After removal of A1 and A4, we named the new region A′. Our most conservative approach attempted to ensure that putative NRRs had no mosaic or phylogenetic incongruence signals.
Green boxplots show the TMRCA estimate for the RaTG13/SARS-CoV-2 lineage and its most closely related pangolin lineage (Guangdong 2019), with the light and dark coloured version based on the HCoV-OC43 and MERS-CoV centred priors, respectively. Accurate estimation of ages for deeper nodes would require adequate accommodation of time-dependent rate variation.
# File containing the ID of the samples, the Sequence of the haplotype, the Continent, the country, the Region, the Data, the Lineage of Pangolin and Nextstrain clade, and the haplotype number
# In this order
# Could be obtained from the database
There is about a 90% genome sequence match between SARS-CoV-2 and a coronavirus in pangolins.
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist
collected SARS-CoV data and assisted in analyses of SARS-CoV and SARS-CoV-2 data.
These are in general agreement with estimates using NRR2 and NRA3, which result in divergence times of 1982 (1948–2009) and 1948 (1879–1999), respectively, for SARS-CoV-2, and estimates of 1952 (1906–1989) and 1970 (1932–1996), respectively, for the divergence time of SARS-CoV from its closest known bat relative. To begin characterizing any ancestral relationships for SARS-CoV-2, NRRs of the genome must be identified so that reliable phylogenetic reconstruction and dating can be performed.
COVID-19: A Catastrophe or Opportunity for Pangolin Conservation? - Nature
Pangolin was developed to implement the dynamic nomenclature of SARS-CoV-2 lineages, known as the Pango nomenclature.
However, on closer inspection, the relative divergences in the phylogenetic tree (Fig.
SARS-CoV-2 itself is not a recombinant of any sarbecoviruses detected to date, and its receptor-binding motif, important for specificity to human ACE2 receptors, appears to be an ancestral trait shared with bat viruses and not one acquired recently via recombination.
Posterior rate distributions for MERS-CoV (far left) and HCoV-OC43 (far right) using BEAST on n=27 sequences spread over 4 years (MERS-CoV) and n=27 sequences spread over 49 years (HCoV-OC43).
Regions A–C had similar phylogenetic relationships among the southern China bat viruses (Yunnan, Guangxi and Guizhou provinces), the Hong Kong viruses, northern Chinese viruses (Jilin, Shanxi, Hebei and Henan provinces, including Shaanxi), pangolin viruses and the SARS-CoV-2 lineage. For the HCoV-OC43, MERS-CoV and SARS datasets we specified flexible skygrid coalescent tree priors.
Frontiers | Novel Highly Divergent SARS-CoV-2 Lineage With the Spike
Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins.
To estimate non-synonymous over synonymous rate ratios for the concatenated coding genes, we used the empirical Bayes Renaissance counting procedure [67]. Transparent bands of interquartile range width and with the same colours are superimposed to highlight the overlap between estimates.
Eden, J.-S., Tanaka, M. M., Boni, M. F., Rawlinson, W. D. & White, P. A. Recombination within the pandemic norovirus GII.4 lineage.
As a proxy, it would be possible to model the long-term purifying selection dynamics as a major source of time-dependent rates [43, 44, 52], but this is beyond the scope of the current study. However, formal testing using marginal likelihood estimation [41] does provide some evidence of a temporal signal, albeit with limited log Bayes factor support of 3 (NRR1), 10 (NRR2) and 3 (NRA3); see Supplementary Table 1. Despite the high frequency of recombination among bat viruses, the block-like nature of the recombination patterns across the genome permits retrieval of a clean subalignment for phylogenetic analysis. In the presence of time-dependent rate variation, a widely observed phenomenon for viruses [43, 44, 52], slower prior rates appear more appropriate for sarbecoviruses that currently encompass a sampling time range of about 18 years.
volume 5, pages 1408–1417 (2020)
Phylogenies of subregions of NRR1 depict an appreciable degree of spatial structuring of the bat sarbecovirus population across different regions (Fig.
The rate of genome generation is unprecedented, yet there is currently no coherent nor accepted scheme for naming the expanding .
The idea is that pangolins carrying the virus, SARS-CoV-2, came into contact with humans. By 2009, however, rapid genomic analysis had become a routine component of outbreak response. The S1 protein of Pangolin-CoV is much more closely related to SARS-CoV-2 than to RaTG13.
Intraspecies diversity of SARS-like coronaviruses in Rhinolophus sinicus and its implications for the origin of SARS coronaviruses in humans.
This long divergence period suggests there are unsampled virus lineages circulating in horseshoe bats that have zoonotic potential due to the ancestral position of the human-adapted contact residues in the SARS-CoV-2 RBD.
& Holmes, E. C. A genomic perspective on the origin and emergence of SARS-CoV-2.
Across a large region of the virus genome, corresponding approximately to ORF1b, it did not cluster with any of the known bat coronaviruses, indicating that recombination probably played a role in the evolutionary history of these viruses [5, 7].
In March, when covid cases began spiking around India, Bani Jolly went hunting for answers in the virus's genetic code.
Means and 95% HPD intervals are 0.080 [0.058–0.101] and 0.530 [0.304–0.780] for the patristic distances between SARS-CoV-2 and RaTG13 (green) and 0.143 [0.109–0.180] and 0.154 [0.093–0.231] for the patristic distances between SARS-CoV-2 and Pangolin 2019 (orange).
Results and discussion
Genomic surveillance has been a hallmark of the COVID-19 pandemic that, in contrast to other pandemics, achieves tracking of the virus evolution and spread worldwide almost in real-time (4). Humans' selfish, speciesist treatment of these animals could be the very reason why the novel coronavirus exists.
4 TMRCAs for SARS-CoV and SARS-CoV-2.
Bryant, D. & Moulton, V.
Neighbor-Net: an agglomerative method for the construction of phylogenetic networks.
This leaves the insertion of polybasic.
5 Comparisons of GC content across taxa.
https://doi.org/10.1093/molbev/msaa163 (2020).
A hypothesis of snakes as intermediate hosts of SARS-CoV-2 was posited during the early epidemic phase [54], but we found no evidence of this [55, 56]; see Extended Data Fig.