Genomic Characteristics of 12 HBV-I Strains in the 2020 National HBV Serosurvey in China
Hepatitis B virus (HBV) remains a significant global public health challenge, with China bearing a substantial burden of chronic HBV infection. Over the past three decades, China has made remarkable progress in combating hepatitis B. HBV exhibits high genetic diversity due to the absence of proofreading activity in its polymerase. It is categorized into ten distinct genotypes (A–J), with over 40 subgenotypes identified to date. Among these, HBV genotype I (HBV-I) is an inter-genotypic recombinant that has emerged during the evolutionary history of HBV. This study provides a comprehensive analysis of the genomic characteristics of HBV-I in China, employing methodologies such as phylogenetic analysis, nucleotide homology assessment, examination of amino acid substitutions within the PreS/S region, recombination detection, and evolutionary analysis.
Introduction
HBV infection continues to pose a significant public health challenge worldwide. In China, the burden of chronic HBV infection is particularly high. The genetic diversity of HBV, driven by its lack of proofreading activity, has led to the classification of the virus into ten genotypes and over 40 subgenotypes. HBV-I, a complex recombinant, is primarily found in Asia, including countries such as Vietnam, Laos, China, Thailand, and India. It comprises segments from genotypes A, G, C, and an unknown genotype (U). The recombination phenomenon in HBV-I poses challenges to current vaccine and antiviral therapy strategies. Mutations in the HBV PreS/S gene can lead to alterations in the antigenic conformation of the hepatitis B surface antigen (HBsAg), impaired synthesis or virion secretion, reduced viral replication capacity, and compromised immune control. Understanding the genomic characteristics of HBV-I is crucial for the prevention, control, and treatment of hepatitis B.
Materials and Methods
Specimen Characteristics
A nationwide serological survey of HBV was conducted in 2020, encompassing 91,971 participants aged 1–69 years from 120 national disease surveillance points across 31 provinces in China. The survey identified 12 HBV-I strains. The carriers of these strains included 3 individuals of Han ethnicity, 8 of Zhuang ethnicity, and 1 of Yao ethnicity, with a median age of 50.5 years and a male proportion of 66.7%. The mean serum HBV DNA level was 4.78 log10 IU/mL. All 12 carriers were positive for HBsAg and negative for hepatitis B surface antibody (anti-HBs). The positive rates of hepatitis B core antibody (anti-HBc), hepatitis B e antigen (HBeAg), and hepatitis B e antibody (anti-HBe) were 75.0%, 16.7%, and 83.3%, respectively.
Amplification and Sequencing of Full-Length HBV DNA
HBV DNA was extracted from 200 μL of HBsAg-positive serum using the Magnetic Bead Method Viral RNA/DNA Extraction Kit. The full-length HBV genome was amplified using a nested and overlapping polymerase chain reaction (PCR). Six overlapping fragments were generated in two rounds of PCR amplification. The PCR products were purified and sequenced by Sangon Biotech Co., Ltd. The full-length HBV genome was assembled by merging six overlapping fragments, along with the basal core promoter (BCP) fragment, using SeqMan Pro Lasergene software.
Phylogenetic Analysis
Sequence alignment was performed using Clustal W, and phylogenetic analysis was conducted using the S gene and/or the full-length sequence. HBV genotypes were assigned with the maximum likelihood (ML) method under the General Time Reversible plus Gamma distribution plus Invariant sites (GTR + G + I) nucleotide substitution model. Branch support was assessed with 1,000 bootstrap replicates, as implemented in MEGA XI software. The reference set comprised 55 HBV genome sequences representing 10 genotypes and 34 subgenotypes.
Nucleotide Homology Analysis and Amino Acid Substitutions in the PreS/S Region
Homology analysis was conducted on the 7 full-length and 12 S gene sequences of HBV-I strains and 97 HBV-I reference sequences. Homology scores were calculated using the Sequence Identity Matrix in BioEdit software. Amino acid sequences of the PreS/S region were obtained by translating the corresponding nucleotide sequences using BioAider software, and the substitutions were then analyzed.
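The idea behind a sequence identity matrix can be sketched in a few lines of Python. This is a minimal illustration, not the BioEdit implementation: the function names, the toy sequences, and the gap handling (columns where both sequences have a gap are skipped, while a gap against a base counts as a mismatch) are all assumptions made for the example.

```python
from itertools import combinations


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of compared alignment columns in which both sequences agree.

    Assumption: columns where both sequences have a gap are ignored;
    a gap opposite a base is counted as a mismatch. Real tools may differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def identity_matrix(aligned: dict) -> dict:
    """Return identity scores keyed by (name_1, name_2) for every pair."""
    names = list(aligned)
    scores = {(name, name): 1.0 for name in names}
    for x, y in combinations(names, 2):
        score = pairwise_identity(aligned[x], aligned[y])
        scores[(x, y)] = scores[(y, x)] = score
    return scores


# Hypothetical 10-column alignment, not real HBV data.
toy_alignment = {
    "strain_1": "ATGGAGAACA",
    "strain_2": "ATGGAGAGCA",
    "ref_1": "ATGGA-AACA",
}
matrix = identity_matrix(toy_alignment)
for (x, y), score in sorted(matrix.items()):
    print(f"{x} vs {y}: {score:.3f}")
```

The reference with the highest score for a given strain is what the homology results report as its closest match; ties (several references with the same score) are possible with short regions such as the S gene.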
Recombination and Evolutionary Analysis of HBV-I Strains
Recombination analysis was conducted using SimPlot and RDP software. Seven algorithms, including RDP, GENECONV, Chimaera, MaxChi, BootScan, SiScan, and 3Seq, were employed to analyze the recombination breakpoints and characteristics of HBV-I strains in comparison with reference strains. Evolutionary analysis followed the methodology outlined in Q. Chen’s report. The time-scaled phylogenetic tree was constructed using PhyloSuite v1.2.3, and the maximum clade credibility (MCC) tree was constructed using the TreeAnnotator in BEAST.
Results
Classification of HBV Strains
All 12 HBV strains were classified as subgenotype I1, based on both the full-length HBV DNA and the S gene sequences. The genome sequences of five HBV strains ranged in length from 3,148 to 3,189 base pairs (bp). The sequences of HBV-2968 and HBV-2974 were notably shorter, at 2,656 bp and 2,644 bp, respectively, because some nucleotides could not be recovered. Serotype was determined by examining specific amino acid positions: HBV-2777 was classified as ayw1, and the remaining 11 strains were identified as adw2.
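Serotype assignment from amino acid positions can be illustrated with a small Python sketch. The summary does not name the positions used in the study, so the rule below is an assumption: it applies the widely cited determinants at residues 122 (K gives d, R gives y) and 160 (K gives w, R gives r) of the S protein. It deliberately stops at the two-letter level; resolving the numeric subtype (for example w1 versus w2) depends on further positions that are not modeled here. The function name and test sequences are hypothetical.

```python
def partial_serotype(s_protein: str) -> str:
    """Return 'adw', 'adr', 'ayw', 'ayr', or 'undetermined'.

    Assumption: uses only residues 122 and 160 (1-based) of the S protein.
    The numeric subtype (w1, w2, ...) is not resolved by this sketch.
    """
    if len(s_protein) < 160:
        raise ValueError("need at least 160 residues of the S protein")
    residue_122 = s_protein[121].upper()
    residue_160 = s_protein[159].upper()
    d_or_y = {"K": "d", "R": "y"}.get(residue_122)
    w_or_r = {"K": "w", "R": "r"}.get(residue_160)
    if d_or_y is None or w_or_r is None:
        return "undetermined"
    return "a" + d_or_y + w_or_r


# Synthetic sequences: alanine filler with only positions 122 and 160 set.
adw_like = "A" * 121 + "K" + "A" * 37 + "K"
ayw_like = "A" * 121 + "R" + "A" * 37 + "K"
print(partial_serotype(adw_like), partial_serotype(ayw_like))
```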
Phylogenetic Analysis
Phylogenetic analysis was conducted on seven near-full-length HBV sequences. HBV-2984 clustered with AF241408 (Vietnam, 1988). HBV-2739 and HBV-2741 grouped with FR714490 (China, 2004). Four strains, HBV-2961, HBV-3014, HBV-2974, and HBV-2968, grouped with GU357844 (China, 2008). In the phylogenetic analysis of the 12 S gene sequences, ten strains were closest in evolutionary distance to two reference strains from China (FR714490 and GU357844). The other two strains (HBV-2777 and HBV-2984) clustered with AF241408 (Vietnam, 1988).
Homology Analysis
Homology analysis was conducted on 7 full-length sequences of HBV-I strains. These seven strains were preliminarily grouped into three clusters. HBV-2739 and HBV-2741 had the highest homology to AF241409 (Vietnam, 1988). HBV-2961 showed the closest homology to FR714493 (China, 2004). HBV-2984 exhibited the highest homology to AF241407 (China, 2004). HBV-3014 was closely related to KF425553 (China, 2013). Homology analysis for HBV-2968 and HBV-2974 was not performed due to the absence of many sequence fragments. Using 97 HBV-I strains as references, homology analysis was also conducted based on S gene sequences. These 12 HBV strains were preliminarily categorized into four clusters. HBV-2739 and HBV-2741 showed the highest homology to FR714490 (China, 2004). HBV-2777 and HBV-2984 exhibited the closest homology to AF241408 (Vietnam, 1988). Four HBV strains were closest to GU357844 (China, 2008). HBV-2960 and HBV-2968 were the closest to three reference strains. HBV-2961 demonstrated the highest homology to FR714493 (China, 2004). HBV-2994 shared the same score with four reference strains.
Amino Acid Substitutions in the PreS/S Region
The open reading frame (ORF) of LHBsAg exhibited complexity. A comparison with FR714490 revealed 61 distinct nucleotide substitutions in the PreS region across the 12 strains. Notably, the substitutions T3081C, C3040T, G25A, and C144T were present in 100.00%, 33.33%, 25.00%, and 25.00% of the strains, respectively. Eleven other nucleotide substitutions were observed in 16.67% of the strains, while the remaining 46 substitutions were each found in only 8.33% of the strains. At almost every site, only one type of substitution was observed. Substitutions were categorized as non-synonymous or synonymous. A total of 35 distinct amino acid substitutions were identified, affecting 32 amino acid positions. Of these, 17 amino acid substitutions changed the properties of the amino acid.
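The distinction between synonymous and non-synonymous substitutions can be shown with a short Python sketch that compares two in-frame coding sequences codon by codon using the standard genetic code. It is an illustration under simplifying assumptions, not the BioAider workflow: both sequences must be gap-free, equal in length, and contain only A, C, G, and T, and the overlap of the PreS/S ORF with the polymerase ORF is ignored. The function name and toy sequences are hypothetical.

```python
# Standard genetic code, enumerated over bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitutions(ref_cds: str, query_cds: str) -> list:
    """List codons that differ, each labeled synonymous or non-synonymous.

    Each entry is (codon_number, ref_codon, query_codon, ref_aa, query_aa, kind),
    with codon_number counted from 1.
    """
    if len(ref_cds) != len(query_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    changes = []
    for start in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[start:start + 3].upper()
        query_codon = query_cds[start:start + 3].upper()
        if ref_codon == query_codon:
            continue
        ref_aa = CODON_TABLE[ref_codon]
        query_aa = CODON_TABLE[query_codon]
        kind = "synonymous" if ref_aa == query_aa else "non-synonymous"
        changes.append(
            (start // 3 + 1, ref_codon, query_codon, ref_aa, query_aa, kind)
        )
    return changes


# Toy sequences, not real HBV data: codon 2 changes silently (CTG -> CTA,
# both leucine); codon 3 changes threonine to alanine (ACC -> GCC).
toy_ref = "ATGCTGACC"
toy_query = "ATGCTAGCC"
changes = classify_substitutions(toy_ref, toy_query)
for change in changes:
    print(change)
```

Only the non-synonymous entries contribute to a list of amino acid substitutions, which is why the number of distinct amino acid substitutions can be smaller than the number of nucleotide substitutions.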
Recombination and Evolutionary Analysis
Recombination analysis was performed on seven HBV strains. Bootscan analysis identified these strains as recombinants of HBV genotypes A, G, and C, with recombination sites approximately mapped to nucleotides 397, 1397, and 2943. With all genotype reference sequences of HBV included in the analysis, nucleotides 2943–397, 398–1397, and 1398–2943 of the seven HBV strains corresponded to genotypes A, G, and C, respectively. However, further analysis indicated that these regions differed from their respective genotypes A, G, and C.
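Because the HBV genome is circular, the genotype A segment (nucleotides 2943–397) wraps around the origin of the numbering. The Python sketch below maps a nucleotide position to its reported parental genotype using the approximate breakpoints given above. The function name is hypothetical, and since position 2943 appears in both the A and C ranges in the text, assigning it to genotype C here is an arbitrary convention chosen for the example.

```python
# Approximate recombination breakpoints reported by the Bootscan analysis.
BREAKPOINTS = (397, 1397, 2943)


def parental_genotype(position: int) -> str:
    """Map a 1-based nucleotide position to the reported parental genotype.

    Assumption: breakpoint 2943 itself is assigned to genotype C; the text
    lists it in both the A and the C segment.
    """
    if position < 1:
        raise ValueError("positions are 1-based")
    first, second, third = BREAKPOINTS
    if first < position <= second:
        return "G"  # nucleotides 398-1397
    if second < position <= third:
        return "C"  # nucleotides 1398-2943
    # Everything else lies on the segment that wraps around the origin:
    # after 2943 to the end of the genome, then 1-397.
    return "A"


for pos in (100, 398, 1398, 3000):
    print(pos, parental_genotype(pos))
```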
A time-scaled phylogenetic tree was constructed based on 102 full-length HBV-I genome sequences. Five sequences from this study fell into three evolutionary branches: HBV-3014 and HBV-2961; HBV-2739 and HBV-2741; HBV-2984. The average clock rate of the HBV-I genome was estimated at 1.17 × 10⁻⁴ substitutions/site/year, and the most recent common ancestor dated back to the year 1774. Additionally, a time-scaled phylogenetic tree was constructed based on 109 HBV-I S gene sequences. The twelve sequences from this study were distributed across six evolutionary branches. The average clock rate of the HBV-I S gene was 1.61 × 10⁻⁴ substitutions/site/year, with the most recent common ancestor estimated to have appeared in the year 1740.
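To give a feel for what these rates imply, the following Python sketch multiplies each clock rate by the time elapsed between the estimated most recent common ancestor and the 2020 sampling year. It reads the reported rates as 1.17 × 10⁻⁴ and 1.61 × 10⁻⁴ substitutions/site/year and assumes a constant (strict) clock along a single lineage, which is a simplification of the actual BEAST analysis; the function name is hypothetical.

```python
def expected_substitutions_per_site(rate: float, start_year: int, end_year: int) -> float:
    """Expected substitutions per site along one lineage under a constant rate."""
    return rate * (end_year - start_year)


SAMPLING_YEAR = 2020  # year of the national serosurvey

# Full genome: rate 1.17e-4 over 2020 - 1774 = 246 years.
genome_divergence = expected_substitutions_per_site(1.17e-4, 1774, SAMPLING_YEAR)

# S gene: rate 1.61e-4 over 2020 - 1740 = 280 years.
s_gene_divergence = expected_substitutions_per_site(1.61e-4, 1740, SAMPLING_YEAR)

print(f"full genome: {genome_divergence:.4f} substitutions/site")
print(f"S gene:      {s_gene_divergence:.4f} substitutions/site")
```

Under these assumptions, a sampled lineage would differ from the common ancestor at roughly 3% of genome sites and about 4.5% of S gene sites.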
Discussion
The World Health Organization has set forth an initiative for the global elimination of viral hepatitis by 2030. China has made significant contributions in preventing, controlling, and treating hepatitis B. The 12 HBV strains from the 2020 national HBV serosurvey in China were classified into genotype I according to phylogenetic analysis, with all samples collected from Guangxi Zhuang Autonomous Region. This distribution pattern offers valuable insights for tracing the origins of HBV-I cases. All 12 HBV-I strains belonged to the I1 subgenotype, with no evidence of the I3 subgenotype. HBV-I strains exhibited two serotypes, ayw1 and adw2, primarily due to the derivation of the HBs segment from genotypes A and G.
Homology analysis revealed that two HBV-I strains exhibited the highest homology with an early HBV-I strain from Vietnam. The remaining HBV-I strains were transmitted locally. The rate of amino acid substitution in the PreS region was higher than that observed in the HBs region. Several mutations associated with immune escape were identified, underscoring the importance of monitoring the mutations in the S gene.
Recombinants involving two genotypes are frequently observed, while those involving three genotypes are less common. The corresponding segments of HBV-I may have originated from ancient HBV strains and evolved independently after recombination. The estimated average clock rates were 1.17 × 10⁻⁴ substitutions/site/year for the HBV-I genome and 1.61 × 10⁻⁴ substitutions/site/year for the S gene. The most recent common ancestor was dated to between 1740 (S gene) and 1774 (full genome). Genotype I has been a minor genotype, never achieving dominance in the areas where it circulates, which accounts for its late discovery.
Conclusions
All 12 HBV-I strains belonged to subgenotype I1 and, with the exception of one ayw1 strain, to serotype adw2; they were preliminarily grouped into clusters based on homology analysis. A higher substitution rate was observed in the antigenic loop of the HBsAg, with potential immune-escape mutations in HBV-I strains. The average molecular clock rate for HBV-I ranges from 1.17 × 10⁻⁴ to 1.61 × 10⁻⁴ substitutions/site/year, with the most recent common ancestor estimated to have existed between 1740 and 1774. Epidemiological surveillance and analysis of the genomic characteristics of HBV genotype I are important for the further prevention and control of hepatitis B.
doi.org/10.1016/j.bsheal.2025.01.007