Multi-view Clustering Based on Structured Tensor Learning
Introduction
Multi-view clustering has emerged as a prominent research area due to the increasing availability of data from diverse sources. Traditional clustering methods often struggle with high-dimensional data, leading to the development of advanced techniques such as subspace clustering, graph-based clustering, and deep learning-based clustering. Among these, subspace clustering methods, particularly those based on spectral clustering, have gained significant attention. However, most existing clustering methods underestimate the impact of noise and fail to fully exploit the complementary structural information inherent in multi-view data. In addition, many approaches ignore the guidance that intermediate clustering results can feed back into the optimization of the low-rank tensor, which can lead to suboptimal solutions.
To address these challenges, this paper introduces a novel method called Multi-view Clustering based on Structured Tensor Learning (MCSTL). The proposed approach integrates noise reduction, structural learning, and spectral clustering into a unified framework, enhancing clustering accuracy and robustness.
Background and Motivation
Challenges in Multi-view Clustering
Multi-view data, collected from multiple sources or perspectives, provides richer information than single-view data. For example, an object can be described using text, images, or other feature types, each offering a unique perspective. While this diversity enhances data representation, traditional single-view clustering methods cannot effectively leverage the consistency and complementarity of multi-view data, resulting in subpar performance.
Existing multi-view clustering methods can be broadly categorized into two types: those based on similarity matrices and those based on subspace clustering. The latter, particularly multi-view subspace clustering, has become a focal point in research. These methods often extend single-view techniques such as Low-Rank Representation (LRR) and Sparse Subspace Clustering (SSC) to handle multi-view data. However, they still face several limitations:
- Noise Sensitivity: Most methods focus on noise in raw data but overlook noise introduced during the construction of representation tensors, which can degrade clustering performance.
- Underutilization of Structural Information: Many approaches fail to fully exploit the complementary structural information, including local, global, and cross-view correlations.
- Isolated Optimization: Traditional methods separate tensor learning and feature matrix learning, ignoring the potential of clustering results to guide tensor optimization.
Contributions of MCSTL
The MCSTL method addresses these limitations through several key innovations:
- Enhanced Noise Reduction: Unlike existing methods, MCSTL performs secondary noise removal on the initial representation tensor, ensuring higher accuracy and robustness.
- Comprehensive Structural Learning: The method simultaneously learns local structures, global structures, and high-order correlations across views, improving the alignment between the representation tensor and the intrinsic cluster structure of the data.
- Joint Optimization: MCSTL integrates tensor learning and feature matrix learning, allowing the clustering structure to guide tensor optimization, leading to better solutions.
- Orthogonal Constraint: An orthogonal constraint is applied to the feature matrix, providing soft label information and enabling direct clustering interpretation.
Methodology
Overview of MCSTL
The MCSTL framework consists of four main components:
- Noise Reduction: The initial representation tensor is refined by removing additional noise, resulting in a cleaner and more reliable tensor.
- Structural Learning: Local, global, and cross-view structural information is jointly learned to enhance the representation tensor’s consistency with the underlying data clusters.
- Cross-view Information Fusion: A unified feature matrix is derived from the affinity matrices of different views, capturing consistent clustering information.
- Clustering Guidance: The feature matrix’s implicit clustering structure is used to guide the optimization of the representation tensor, ensuring better alignment with the true cluster structure.
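The fusion and feature-matrix steps above can be sketched in a deliberately simplified form: average the per-view affinity matrices and extract an orthogonal feature matrix by spectral embedding of the normalized Laplacian. The averaging rule and the Laplacian-based embedding are generic stand-ins, not necessarily MCSTL's exact formulation.

```python
import numpy as np

def fuse_and_embed(affinities, k):
    """Fuse per-view affinity matrices and extract an orthogonal
    feature matrix via spectral embedding (a common sketch; the
    paper's exact fusion rule and objective may differ)."""
    # Simple fusion: average the symmetrized affinity matrices.
    W = sum(0.5 * (A + A.T) for A in affinities) / len(affinities)
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # The k eigenvectors with smallest eigenvalues form the feature
    # matrix F; eigh returns orthonormal columns, so F.T @ F = I.
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]
```

Because the columns come from an eigendecomposition of a symmetric matrix, the orthogonality constraint on the feature matrix holds by construction.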
Key Steps in MCSTL
- Initial Representation Tensor Construction: The method begins by constructing an initial representation tensor from multi-view data.
- Secondary Noise Removal: Noise introduced during tensor construction is further reduced, improving the tensor’s accuracy.
- Local and Global Structure Learning: Pairwise sample differences are computed to learn local structures, while low-rank constraints capture global structures.
- High-order Correlation Exploration: A third-order tensor is used to model high-order correlations across different views.
- Feature Matrix Learning: A unified feature matrix is learned from the fused affinity matrices, providing a consistent clustering structure.
- Orthogonal Constraint Application: The feature matrix is constrained to be orthogonal, simplifying clustering interpretation.
- Iterative Optimization: An efficient optimization algorithm is employed to iteratively refine the representation tensor and feature matrix.
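For the high-order correlation step, many tensor-based multi-view methods stack the per-view representation matrices into a third-order tensor and enforce low rank via singular value thresholding in the Fourier domain (t-SVT), the proximal operator of the tensor nuclear norm. Whether MCSTL uses exactly this operator is an assumption; the sketch below shows the generic version.

```python
import numpy as np

def tensor_svt(Z, tau):
    """Tensor singular value thresholding (t-SVT): FFT along the
    view mode, soft-threshold the singular values of each frontal
    slice, then invert the FFT. Illustrative sketch only, not
    necessarily MCSTL's exact low-rank operator.
    Z has shape (n, n, V): one n x n representation slice per view."""
    Zf = np.fft.fft(Z, axis=2)            # FFT along the view mode
    out = np.zeros_like(Zf)
    for v in range(Z.shape[2]):
        U, s, Vt = np.linalg.svd(Zf[:, :, v], full_matrices=False)
        s = np.maximum(s - tau, 0.0)      # soft-threshold singular values
        out[:, :, v] = (U * s) @ Vt
    return np.real(np.fft.ifft(out, axis=2))
```

With `tau = 0` the operator is the identity; larger `tau` shrinks the tensor toward low rank, which is how the low-rank constraint captures global structure across views.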
Optimization Process
The optimization process involves alternating updates of the representation tensor, noise tensor, and feature matrix. The algorithm ensures convergence by iteratively minimizing the objective function while satisfying constraints. Key steps include:
- Updating the Representation Tensor: The tensor is updated to balance noise reduction and structural learning.
- Learning the Feature Matrix: The feature matrix is derived from the affinity matrices, ensuring consistency across views.
- Applying Orthogonal Constraints: The feature matrix is constrained to maintain orthogonality, facilitating direct clustering.
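The alternating scheme above can be sketched with generic stand-in subproblems: a singular-value-thresholding step for the low-rank update and an eigendecomposition for the orthogonal feature matrix. The actual MCSTL subproblems (noise tensor, structural terms, Lagrange multipliers) are more involved; this only illustrates the alternate-and-check-convergence skeleton.

```python
import numpy as np

def alternating_optimize(A, tau=0.5, k=2, iters=100, tol=1e-8):
    """Skeleton of an alternating optimization loop (hypothetical
    stand-in updates, not the paper's exact subproblems)."""
    Z = A.copy()
    for it in range(iters):
        # Low-rank update: pull Z toward the data term A, then
        # shrink its singular values (singular value thresholding).
        U, s, Vt = np.linalg.svd(0.5 * (Z + A), full_matrices=False)
        Z_new = (U * np.maximum(s - tau, 0.0)) @ Vt
        # Feature-matrix update: top-k eigenvectors of the induced
        # affinity |Z| + |Z|^T are orthogonal by construction.
        W = np.abs(Z_new) + np.abs(Z_new).T
        _, vecs = np.linalg.eigh(W)
        F = vecs[:, -k:]
        # Stop when the iterates stabilize.
        if np.linalg.norm(Z_new - Z) < tol:
            Z = Z_new
            break
        Z = Z_new
    return Z, F, it + 1
```

The convergence check mirrors the paper's behavior of minimizing the objective until successive iterates change negligibly.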
Experimental Results
Datasets and Evaluation Metrics
The performance of MCSTL was evaluated on five real-world datasets:
- BBC-4view: A news article dataset with 685 samples and four feature types.
- BBC-sport: A sports news dataset with 544 samples and two feature types.
- UCI-3view: A handwritten digit dataset with 2,000 samples and three feature types.
- StillDB: A static action image dataset with 467 samples and three feature types.
- Flowers: A flower image dataset with 1,360 samples and three feature types.
Six evaluation metrics were used to assess clustering performance:
- Accuracy (ACC): Measures the proportion of correctly clustered samples after the best one-to-one matching between predicted clusters and true labels.
- Normalized Mutual Information (NMI): Evaluates the similarity between clustering results and true labels.
- Adjusted Rand Index (ARI): Quantifies the agreement between clustering results and ground truth.
- Precision (P): Measures the proportion of predicted positives that are true positives.
- Recall: Measures the proportion of actual positives that are correctly identified.
- F-score: The harmonic mean of precision and recall.
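Two of these metrics have clustering-specific conventions worth making concrete: ACC must be permutation-invariant (cluster IDs are arbitrary), and precision/recall/F-score are often computed over sample pairs. The sketch below uses those common conventions; the paper may follow a slightly different one.

```python
import numpy as np
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Best-match ACC: maximize agreement over all label
    permutations (brute force; fine for small cluster counts)."""
    pred_labels = np.unique(y_pred)
    best = 0.0
    for perm in permutations(np.unique(y_true)):
        mapping = dict(zip(pred_labels, perm))
        mapped = np.array([mapping[p] for p in y_pred])
        best = max(best, float(np.mean(mapped == y_true)))
    return best

def pairwise_prf(y_true, y_pred):
    """Pairwise precision/recall/F: compare which sample pairs the
    clustering puts together against the ground-truth pairs."""
    n = len(y_true)
    tp = fp = fn = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_pred = y_pred[i] == y_pred[j]
            same_true = y_true[i] == y_true[j]
            tp += int(same_pred and same_true)
            fp += int(same_pred and not same_true)
            fn += int(same_true and not same_pred)
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

For NMI and ARI, off-the-shelf implementations (e.g. in scikit-learn's metrics module) are the usual choice.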
Performance Comparison
MCSTL was compared against 14 state-of-the-art clustering methods, including SSC, LRR, RMSC, LT-MSC, and CLR-MVP. The results demonstrated MCSTL’s superiority across all datasets and metrics:
- BBC-sport: MCSTL achieved perfect clustering (100% accuracy), outperforming all other methods.
- UCI-3view: The method achieved near-perfect clustering (99.9% accuracy), significantly surpassing competitors.
- StillDB: MCSTL showed substantial improvements in recall and other metrics compared to baseline methods.
- Flowers and BBC-4view: Despite the datasets’ complexity, MCSTL maintained stable and superior performance.
Ablation Study
An ablation study confirmed the importance of secondary noise removal. Removing this step (MCSTL-baseline) led to noticeable performance degradation, highlighting the necessity of noise reduction in representation tensor construction.
Convergence Analysis
MCSTL exhibited rapid convergence, typically stabilizing within 60 iterations. This efficiency makes it suitable for large-scale datasets.
Applications
Practical Use Cases
MCSTL’s effectiveness was demonstrated in real-world applications, such as:
- Text Classification: The method successfully clustered news articles into distinct topics.
- Image Recognition: MCSTL accurately grouped handwritten digits and flower images based on visual features.
- Disease Subtyping: The framework’s ability to handle multi-view data makes it applicable to biomedical clustering tasks.
Case Study: Handwritten Digit Clustering
A detailed analysis of the UCI-3view dataset revealed that MCSTL correctly clustered most handwritten digits, with only minor misclassifications (e.g., some ‘4’s and ‘8’s were incorrectly grouped with ‘6’s). Visualization using t-SNE confirmed that MCSTL’s feature matrix produced well-separated clusters, unlike raw data views where classes overlapped.
Conclusion
The MCSTL method represents a significant advancement in multi-view clustering by addressing key limitations of existing approaches. Its integration of noise reduction, structural learning, and joint optimization yields high accuracy and robustness across diverse datasets. Experimental results demonstrated MCSTL’s superiority: it achieved the best result on 27 of the 30 dataset-metric combinations (five datasets, six metrics).
Future work will explore extending MCSTL to incomplete multi-view clustering and optimizing its scalability for larger datasets. The method’s versatility and effectiveness make it a valuable tool for various clustering applications, from text analysis to biomedical research.
DOI: 10.19734/j.issn.1001-3695.2024.07.0278