Genome-wide Long Non-coding RNA Association Study on Han Chinese Women Identifies lncHSAT164 as a Novel Susceptibility Gene for Breast Cancer
Introduction
Breast cancer remains a leading cause of cancer-related mortality among women globally, and its incidence is rising in China. Genome-wide association studies (GWAS) have identified over 180 breast cancer-associated single-nucleotide polymorphisms (SNPs) across more than 100 susceptibility loci, but most of these variants reside in non-coding regions of the genome. Long non-coding RNAs (lncRNAs), which regulate diverse biological processes including carcinogenesis, are promising candidates for explaining the functional role of these non-coding variants. However, large-scale studies of the contribution of lncRNAs to breast cancer susceptibility in Han Chinese populations are lacking. This study addresses that gap with the first genome-wide lncRNA association analysis in Han Chinese women, which identified a novel susceptibility locus and examined its functional relevance in breast cancer cells.
Methods
Study Design and Cohorts
A two-stage genome-wide lncRNA association study was performed using a custom-designed lncRNA array covering >800,000 SNPs. The discovery stage included 1,496 breast cancer patients and 1,257 healthy controls, while the replication stage involved 4,138 cases and 5,051 controls. All participants were unrelated Han Chinese women. Breast cancer cases were pathologically confirmed, and controls had no personal or family history of breast cancer. Peripheral blood samples and breast tissue specimens (cancer and adjacent non-cancerous tissues) were collected for genotyping and functional analyses.
lncRNA Array and Quality Control
The lncRNA array incorporated SNPs from the NONCODE database, RegulomeDB, and the 1000 Genomes Project. Key features included:
- 425,000 SNPs in non-coding regions.
- 8,187 SNPs in regulatory regions (promoters/enhancers).
- 11,466 SNPs in miRNA-binding regions.
- Overlap with 150,000 SNPs from the Illumina Human OmniZhongHua array.
Quality control (QC) steps excluded SNPs with call rates <99%, minor allele frequency (MAF) <0.0001, or Hardy-Weinberg equilibrium (HWE) deviations (P < 10⁻⁴). Population stratification was assessed using principal component analysis (PCA).
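The QC thresholds above can be illustrated with a minimal Python sketch. It is not the study's pipeline: the `SnpCounts` container, the function names, and the choice of a 1-degree-of-freedom chi-square test for HWE are assumptions made for illustration, and the summary does not say whether HWE was tested in controls only. Only the three cut-offs (call rate 99%, MAF 0.0001, HWE P of 10⁻⁴) come from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Genotype counts for one SNP (hypothetical container, not from the study)."""
    snp_id: str
    n_aa: int       # homozygous for the first allele
    n_ab: int       # heterozygous
    n_bb: int       # homozygous for the second allele
    n_missing: int  # samples without a genotype call


def call_rate(s: SnpCounts) -> float:
    called = s.n_aa + s.n_ab + s.n_bb
    total = called + s.n_missing
    return called / total if total else 0.0


def minor_allele_freq(s: SnpCounts) -> float:
    called = s.n_aa + s.n_ab + s.n_bb
    if called == 0:
        return 0.0
    p = (2 * s.n_aa + s.n_ab) / (2 * called)
    return min(p, 1 - p)


def hwe_p_value(s: SnpCounts) -> float:
    """Chi-square goodness-of-fit test against HWE proportions (1 df).

    Assumption: the study may have used a different test (e.g. an exact test).
    """
    n = s.n_aa + s.n_ab + s.n_bb
    if n == 0:
        return 1.0
    p = (2 * s.n_aa + s.n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    observed = (s.n_aa, s.n_ab, s.n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(s: SnpCounts, min_call_rate: float = 0.99,
              min_maf: float = 0.0001, min_hwe_p: float = 1e-4) -> bool:
    """Apply the three exclusion criteria quoted in the text."""
    return (call_rate(s) >= min_call_rate
            and minor_allele_freq(s) >= min_maf
            and hwe_p_value(s) >= min_hwe_p)
```

For example, a SNP with genotype counts 640/320/40 and no missing calls sits exactly at HWE proportions and passes all three filters, whereas a SNP with counts 500/0/500 fails the HWE criterion.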
Functional Experiments
Cell Culture
Breast cancer cell lines (MCF7, T47D) and non-tumorigenic MCF10A cells were cultured under standard conditions. RNA extraction, RT-qPCR, and subcellular fractionation were performed to assess lncHSAT164 expression.
lncRNA Characterization
- Northern Blot and RACE: Determined transcript length and polyadenylation status.
- Subcellular Localization: Nuclear and cytoplasmic fractions were analyzed by RT-qPCR.
Gain- and Loss-of-Function Studies
- Overexpression: pcDNA3.1-lncHSAT164 plasmids were transfected into MCF7 and T47D cells.
- Knockdown: Lentiviral short hairpin RNAs (shRNAs) targeting lncHSAT164 were transduced into T47D cells.
Phenotypic Assays
- Colony Formation: Cells (1 × 10³/well) were cultured for 10–15 days, stained with crystal violet, and quantified.
- Apoptosis and Cell Cycle: Analyzed by flow cytometry using Annexin V-APC/PI staining (apoptosis) and PI/RNase A staining (cell cycle).
Results
Identification of Breast Cancer-Associated SNPs
In the discovery stage, 1,675 SNPs with P < 10⁻⁴ were identified. Replication and meta-analysis confirmed two significant loci:
- rs11066150 (12q24.13, P_meta = 2.34 × 10⁻⁸, OR = 1.13): Located in intron 1 of the lncRNA transcript NONHSAT164009.1 (lncHSAT164).
- rs9397435 (6q25.1, P_meta = 4.32 × 10⁻³⁸, OR = 1.41): A known risk locus near CCDC170, validating array reliability.
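The summary does not state how the discovery and replication stages were combined. As one common possibility, the sketch below pools per-stage odds ratios with a fixed-effect, inverse-variance meta-analysis and compares P values against the conventional genome-wide threshold of 5 × 10⁻⁸. The function names and the threshold constant are assumptions for illustration, and any per-stage odds ratios or standard errors passed in are hypothetical because the summary reports only the pooled values.

```python
import math

# Conventional genome-wide significance threshold (assumed; not stated in the text)
GENOME_WIDE_ALPHA = 5e-8


def fixed_effect_meta(estimates):
    """Pool (odds_ratio, se_of_log_or) pairs by inverse-variance weighting.

    Returns (pooled_or, pooled_se_of_log_or, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    log_ors = [math.log(odds_ratio) for odds_ratio, _ in estimates]
    total_weight = sum(weights)
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_log_or / pooled_se
    # Two-sided P value under a standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(pooled_log_or), pooled_se, p_value


def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    return p_value < alpha
```

Under that threshold, both reported meta-analysis P values (2.34 × 10⁻⁸ for rs11066150 and 4.32 × 10⁻³⁸ for rs9397435) qualify as genome-wide significant, although rs11066150 clears it by a much narrower margin.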
lncHSAT164 Expression and Characterization
- Tissue Specificity: lncHSAT164 expression was elevated in breast cancer tissues compared to adjacent non-cancerous tissues (P < 0.05, Figure 2A).
- Cell Line Expression: lncHSAT164 levels were 1.8-fold higher in T47D cells than in MCF10A cells (P < 0.01, Figure 2B).
- Transcript Structure: Northern blot and RACE confirmed a ~1,700-nucleotide polyadenylated transcript (Figure 2C).
- Subcellular Localization: Predominantly nuclear in MCF7 (83%) and T47D (76%) cells (Figure 2D).
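Fold changes and nuclear percentages of this kind are typically derived from RT-qPCR cycle threshold (Ct) values. The summary does not say which quantification method or reference gene was used, so the sketch below assumes the widely used 2^(-ΔΔCt) method with roughly 100% amplification efficiency, and it estimates the nuclear share from Ct values measured on equivalent amounts of each fraction. All function and parameter names are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target in sample vs. control by 2^(-ddCt).

    Assumes ~100% amplification efficiency and a stable reference gene.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


def nuclear_fraction(ct_nuclear, ct_cytoplasmic):
    """Share of transcript in the nuclear fraction.

    Assumes both fractions were processed from equivalent cell inputs,
    so that 2^(-Ct) is proportional to transcript abundance.
    """
    nuclear = 2 ** -ct_nuclear
    cytoplasmic = 2 ** -ct_cytoplasmic
    return nuclear / (nuclear + cytoplasmic)
```

With this convention, a target that amplifies one cycle earlier in the sample than in the control (after reference-gene normalization) corresponds to a 2-fold increase, and a nuclear Ct two cycles lower than the cytoplasmic Ct corresponds to an 80% nuclear share.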
Functional Role of lncHSAT164
Overexpression Effects
- Colony Formation: Increased clonogenicity in MCF7 (2.1-fold, P < 0.001) and T47D (1.7-fold, P < 0.01, Figure 2F–G).
Knockdown Effects
- Apoptosis: shRNA-mediated knockdown in T47D increased apoptosis (15.3% vs. 6.7% in controls, P < 0.01, Figure 3B,D).
- Cell Cycle Arrest: G2/M phase proportion rose from 12.1% to 21.4% (P < 0.05, Figure 3C,E).
- Colony Formation: Reduced clonogenic survival by 65% (P < 0.001, Figure 3F–G).
Discussion
This study identifies lncHSAT164 as a novel breast cancer susceptibility gene in Han Chinese women. The rs11066150 risk allele, located in intron 1 of lncHSAT164, reached genome-wide significance in the meta-analysis. In functional assays, lncHSAT164 overexpression increased colony formation, whereas knockdown increased apoptosis, raised the proportion of cells in G2/M phase, and reduced clonogenic survival. These results are consistent with an oncogenic role for lncHSAT164 in cell cycle progression and apoptosis resistance, and with its predominantly nuclear localization.
The lncRNA array design, incorporating non-coding SNPs and regulatory elements, proved effective in identifying both novel and established loci (e.g., rs9397435). However, limitations include the lack of mechanistic data linking rs11066150 to lncHSAT164 expression and the need for larger cohorts to validate associations. Future studies should explore lncHSAT164’s interaction partners and downstream targets, particularly in cell cycle pathways like Cyclin D1/CDK4.
Conclusion
By integrating GWAS with functional genomics, this work establishes lncHSAT164 as a key player in breast cancer pathogenesis. The findings highlight the importance of non-coding variants in cancer susceptibility and underscore lncRNAs as potential therapeutic targets. Further investigation into lncHSAT164’s molecular mechanisms may yield novel strategies for breast cancer prevention and treatment.
doi.org/10.1097/CM9.0000000000001429