Introduction
Complex networks, which abstract real-world systems into network models, provide an effective framework for understanding intricate systems. Examples include social networks, transportation networks, and communication networks. Identifying key nodes—nodes that significantly influence network structure and function—is crucial in applications such as detecting influential scholars, transportation hubs, and social media influencers.
Existing methods for key node identification often rely on a combination of global and local node features. Traditional approaches include degree centrality, K-shell decomposition, and eigenvector centrality. However, these methods often fail to comprehensively capture global node characteristics, limiting their accuracy.
To address these limitations, this paper proposes a novel method for identifying key nodes in complex networks using an improved Double Deep Q-Network (DDQN) algorithm. The method enhances DDQN by refining initial reward values, introducing backtracking exploration, and implementing priority access mechanisms. Additionally, it integrates local features through clustering coefficients to improve identification accuracy.
Background
Traditional Centrality Measures
Several traditional centrality measures are commonly used in complex network analysis:
- Eigenvector Centrality (EC): Measures a node’s influence based on the importance of its neighbors.
- Betweenness Centrality (BC): Quantifies how often a node lies on the shortest paths between other nodes.
- Closeness Centrality (CC): Evaluates how close a node is to all other nodes in the network.
- Clustering Coefficient (C): Assesses the degree to which a node’s neighbors are interconnected.
While these metrics provide valuable insights, they often focus on either local or global aspects, leading to incomplete assessments of node importance.
Challenges in Existing Methods
Current key node identification methods face two main challenges:
- Incomplete Global Feature Extraction: Many methods only consider a limited neighborhood around a node, missing broader network influences.
- Training Efficiency and Stability in DDQN: Standard DDQN algorithms suffer from slow convergence and unstable training outcomes, particularly in complex networks with numerous neighbors.
Proposed Method
The proposed method consists of three main components:
- Initial Reward Value Construction
- Improved DDQN Algorithm
- Node Importance Ranking
- Initial Reward Value Construction
To better guide the DDQN algorithm, initial reward values are reconstructed using a combination of global centrality measures (EC, BC, CC) weighted by both subjective (Analytic Hierarchy Process, AHP) and objective (entropy weight method) criteria.
- AHP determines subjective weights by comparing the relative importance of each centrality measure.
- Entropy Weight Method calculates objective weights based on the information entropy of each measure.
- The final reward for each node is a weighted sum of its EC, BC, and CC values.
This approach ensures that the reward values reflect both expert knowledge and data-driven insights.
- Improved DDQN Algorithm
The standard DDQN algorithm is enhanced with two key innovations:
a) Backtracking Exploration
To improve training efficiency, a backtracking mechanism is introduced. During exploration:
- A visited node set is maintained to avoid redundant paths.
- If a node has no unvisited neighbors, the algorithm backtracks to the previous node and explores alternative paths.
This ensures that each training episode efficiently explores unique paths without repetition.
b) Priority Access
To stabilize training outcomes, a priority access mechanism is implemented:
- Nodes with below-average visit counts are prioritized in subsequent exploration phases.
- The priority set is updated periodically to ensure balanced exploration.
This prevents the algorithm from converging prematurely to suboptimal solutions.
- Node Importance Ranking
Node importance is determined by combining global and local features:
- Global Feature (OWP): Measures how frequently a node appears in optimal paths between other nodes, reflecting its global influence.
- Local Feature (OCC): Sums the clustering coefficients of a node’s immediate neighbors, capturing its local connectivity.
The final importance score is a weighted combination of OWP and OCC, where the optimal weight is determined experimentally.
Experimental Results
Datasets
Seven real-world networks were used for evaluation:
- Polbooks: A network of political books and their citations.
- USAir97: A flight connection network.
- Jazz: A social network of jazz musicians.
- Celegant: A neural network of C. elegans.
- Bio-Celegant: A metabolic network.
- CAG_mat364: A combinatorial relationship network.
- Ia-infect-hyper: A human interaction network.
Evaluation Metrics
- Training Efficiency: Measured by training time and convergence episodes.
- Network Performance: Evaluates the impact of removing top nodes on network connectivity.
- SIR Model: Simulates disease spread to assess the influence of identified key nodes.
Key Findings
- Improved Training Efficiency
• The improved DDQN algorithm reduced training time from hundreds of minutes to tens of minutes. • Convergence episodes decreased from over 1,000 to around 200. - Enhanced Node Coverage
• Most nodes were visited over 100 times, ensuring comprehensive exploration. - Superior Key Node Identification
• The proposed method outperformed nine baseline methods in network performance tests. • In SIR simulations, the identified key nodes demonstrated stronger spreading capabilities over time. - Optimal Feature Fusion
• In sparse networks (e.g., Bio-Celegant), global features (OWP) dominated. • In dense networks (e.g., Jazz), local features (OCC) were more influential.
Ablation Study
Removing any component (initial reward construction, backtracking, or priority access) degraded performance, confirming their necessity.
Conclusion
This paper presents an advanced method for identifying key nodes in complex networks by improving the DDQN algorithm and integrating global and local features. The method significantly enhances training efficiency, stability, and identification accuracy. Future work will extend this approach to larger and directed networks.
DOI: 10.19734/j.issn.1001-3695.2024.09.0327
Was this helpful?
0 / 0