Introduction
Symbolic regression is a powerful technique in data-driven modeling that automatically discovers mathematical expressions to describe relationships within a dataset. Unlike traditional regression methods, which require predefined functional forms, symbolic regression simultaneously searches for both the structure and coefficients of the expression, making it highly flexible and interpretable. This capability is particularly valuable in fields such as physics, finance, and materials science, where explicit mathematical models are essential for understanding underlying phenomena.
Despite its advantages, traditional genetic programming (GP) methods for symbolic regression suffer from inefficiencies due to random exploration in the vast space of possible expressions. The lack of direction in the search process often leads to slow convergence and suboptimal solutions. To address these limitations, this paper introduces a novel mutation operator called the Neural Network Operator (NNO), which leverages recurrent neural networks (RNNs) to guide the evolutionary process. By learning data features and optimizing expressions dynamically, the NNO significantly improves the efficiency and effectiveness of GP-based symbolic regression.
Background and Related Work
Symbolic Regression and Genetic Programming
Symbolic regression aims to find a mathematical expression that best fits a given dataset. The input typically consists of a matrix of input variables and a corresponding output vector, while the output is a mathematical expression composed of operators, variables, and constants. The goal is to minimize an error function that measures the discrepancy between the predicted and actual values.
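As an illustration, a common choice of error function, and the one this paper's reward computation is later based on, is the normalized root mean square error (NRMSE). A minimal sketch (the example expression and data are illustrative, not from the paper):

```python
import numpy as np

def nrmse(y_true, y_pred):
    """Normalized root mean square error: RMSE divided by the standard
    deviation of the targets, so scores are comparable across datasets."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / np.std(y_true)

# Score two candidate expressions against data generated by f(x) = x**2 + x
x = np.linspace(-1.0, 1.0, 50)
y = x**2 + x
print(nrmse(y, x**2 + x))   # exact recovery -> 0.0
print(nrmse(y, x**2))       # missing term  -> positive error
```

A candidate that reproduces both the structure and the coefficients of the true expression drives this error to zero, which is exactly the joint search symbolic regression performs.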
Genetic programming, a popular approach for symbolic regression, mimics natural evolution by iteratively evolving a population of candidate expressions. The process involves selection, crossover, and mutation operations to generate new expressions with improved fitness. However, traditional GP methods often struggle with inefficiencies due to their reliance on random search and limited exploitation of data features.
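The generational loop described above can be sketched as follows. This is an illustrative skeleton, not the paper's implementation; the operator arguments, tournament size, and mutation probability are assumptions:

```python
import random

def evolve(population, fitness, crossover, mutate,
           n_gen=50, k=3, p_mut=0.1):
    """Skeleton of a generational GP loop: tournament selection,
    crossover, and mutation (illustrative, not the paper's code)."""
    for _ in range(n_gen):
        scored = [(fitness(ind), ind) for ind in population]
        def tournament():
            # tournament selection: best of k randomly drawn candidates
            return min(random.sample(scored, k), key=lambda t: t[0])[1]
        nxt = []
        while len(nxt) < len(population):
            child = crossover(tournament(), tournament())
            if random.random() < p_mut:
                child = mutate(child)   # conventionally a random change
            nxt.append(child)
        population = nxt
    return min(population, key=fitness)
```

With expression trees as individuals, `crossover` would swap subtrees and `mutate` would randomly alter a node; it is this purely random `mutate` step that the NNO replaces with an RNN-guided one.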
Existing Approaches and Limitations
Several advancements have been made to enhance GP for symbolic regression. For instance, gplearn, a classic GP library, uses tournament selection and various mutation strategies to evolve expressions. PySR, a more recent approach, employs parallel populations, asynchronous migration, and complexity-aware error metrics to improve search efficiency. Despite these improvements, both methods still struggle to exploit data features to guide the search.
Deep learning-based approaches, such as AI Feynman and physics-guided neural networks, have shown promise in recovering symbolic expressions by leveraging neural networks to learn patterns in data. However, these methods often rely on predefined models or constraints, limiting their applicability to complex or unknown datasets.
Recent hybrid approaches combine GP with neural networks to enhance search efficiency. For example, some methods use neural networks to initialize populations or guide search directions. While these techniques offer improvements, they do not fully integrate neural networks into the evolutionary process, leaving room for further optimization.
Neural Network Operator: Design and Implementation
Overview of the Neural Network Operator
The Neural Network Operator (NNO) is a novel mutation operator designed to address the limitations of traditional GP methods. Unlike conventional mutation operators that rely on random changes, the NNO uses an RNN to learn data features and systematically optimize expressions. The operator is embedded within the GP framework, dynamically refining expressions during each generation to steer the population toward lower-error solutions.
Architecture of the Recurrent Neural Network
The NNO employs a Long Short-Term Memory (LSTM) network to process and generate mathematical expressions. The network architecture consists of:
- Input Layer: The input combines one-hot encodings of the parent and sibling nodes of the current symbol in the expression tree. This contextual information helps the network generate syntactically valid expressions.
- Hidden Layer: A single LSTM layer with 256 units captures both short-term and long-term dependencies in the sequence of symbols.
- Output Layer: A fully connected layer with a softmax activation produces a probability distribution over the possible symbols, enabling the generation of diverse expressions.
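The input encoding and output distribution described above can be sketched as follows. The token vocabulary and the random stand-ins for the trained LSTM state and output weights are assumptions for illustration only:

```python
import numpy as np

TOKENS = ["add", "mul", "sin", "cos", "x", "const"]  # illustrative token set
V = len(TOKENS)

def one_hot(idx, size):
    v = np.zeros(size)
    v[idx] = 1.0
    return v

def make_input(parent_idx, sibling_idx):
    """Concatenate one-hot encodings of the parent and sibling tokens,
    mirroring the NNO's input layer."""
    return np.concatenate([one_hot(parent_idx, V), one_hot(sibling_idx, V)])

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Shapes: input (2V,) -> LSTM hidden state (256,) -> distribution over V tokens
x = make_input(TOKENS.index("add"), TOKENS.index("x"))
h = np.random.randn(256)                # stands in for the LSTM hidden state
W_out = np.random.randn(V, 256) * 0.01  # stands in for the trained output layer
p = softmax(W_out @ h)                  # probabilities over tokens, sums to ~1
```

In the actual operator, `h` is produced by the 256-unit LSTM layer from the symbol sequence so far, and the next symbol is sampled from `p`.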
Key Steps in the Neural Network Operator
The NNO operates through three main steps:
- Prefix Sequence Generation: The operator converts the current best expression into a sequence of symbols using a prefix traversal. This sequence is then decomposed into progressively longer prefixes, each representing a partial expression.
- Prefix Sequence Completion: Each prefix is fed into the RNN, which generates multiple completions. The RNN uses a prior probability mask to enforce constraints such as expression length, nesting rules, and inverse operations, ensuring the generated expressions are mathematically meaningful.
- Expression Evaluation: The completed sequences are converted back into expression trees and evaluated for their fitness. The best-performing expression replaces the original in the population, driving the evolution toward lower-error solutions.
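The first step, prefix sequence generation, can be sketched as follows; the nested-tuple tree encoding is an assumption for illustration:

```python
def prefix_traversal(node):
    """Flatten an expression tree (nested tuples: (op, child, ...))
    into a prefix token sequence."""
    if isinstance(node, tuple):
        op, *children = node
        seq = [op]
        for c in children:
            seq += prefix_traversal(c)
        return seq
    return [node]   # leaf: variable or constant

# sin(x) + x*x as a nested-tuple expression tree
tree = ("add", ("sin", "x"), ("mul", "x", "x"))
seq = prefix_traversal(tree)
print(seq)  # ['add', 'sin', 'x', 'mul', 'x', 'x']

# Progressively longer prefixes, each a partial expression for the RNN to complete
prefixes = [seq[:i] for i in range(1, len(seq))]
print(prefixes[0])  # ['add']
```

Each such prefix preserves part of the current best expression; the RNN then proposes completions for each one, and step three keeps whichever completed expression scores best.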
Training the Neural Network
The RNN is trained using a risk-seeking policy gradient method, which focuses on optimizing the highest-reward expressions rather than average performance. The training process involves:
- Reward Calculation: The reward for an expression is inversely related to its normalized root mean square error (NRMSE), encouraging the network to prioritize accurate solutions.
- Policy Gradient Updates: The network parameters are updated to maximize the expected reward, with a focus on the top-performing expressions. This approach ensures that the network learns to generate high-quality solutions efficiently.
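The reward and the risk-seeking selection can be sketched as follows. The exact reward formula and quantile threshold are assumptions (one common form, used e.g. in deep symbolic regression); the paper's may differ:

```python
import numpy as np

def reward(nrmse):
    """Reward inversely related to NRMSE (assumed form: 1 / (1 + NRMSE))."""
    return 1.0 / (1.0 + nrmse)

def risk_seeking_batch(rewards, epsilon=0.05):
    """Keep only the top-epsilon fraction of sampled expressions; only
    these contribute to the policy-gradient update, with the (1 - epsilon)
    reward quantile acting as the baseline."""
    r = np.asarray(rewards)
    baseline = np.quantile(r, 1.0 - epsilon)
    top = r[r >= baseline]
    return top, baseline

# Five sampled expressions with NRMSEs 0.0 .. 10.0; only the best survives
rewards = [reward(e) for e in [0.0, 0.5, 1.0, 2.0, 10.0]]
top, baseline = risk_seeking_batch(rewards, epsilon=0.2)
```

Training on `top - baseline` rather than on all samples is what makes the objective risk-seeking: the network is pushed toward its best-case expressions instead of its average ones.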
Experimental Evaluation
Nguyen Dataset Experiments
The effectiveness of the NNO was evaluated on the Nguyen benchmark dataset, which includes 12 mathematical expressions of varying complexity. The experiments compared four methods:
- gplearn: A traditional GP method.
- NN-gplearn: gplearn enhanced with the NNO.
- PySR: A state-of-the-art GP method with parallel populations.
- NN-PySR: PySR enhanced with the NNO.
The results demonstrated that the NNO significantly improved both the recovery rate of correct expressions and the convergence speed. NN-gplearn and NN-PySR achieved higher recovery rates and faster error reduction than their baseline counterparts. Notably, NN-PySR recovered all 12 expressions in the Nguyen dataset, showcasing the operator's ability to handle diverse mathematical forms.
Macroeconomic Dataset Experiments
To assess real-world applicability, the NNO was tested on a macroeconomic dataset generated using a multi-agent simulation model. The dataset included variables such as money supply, price levels, and transaction volumes. The experiments measured the average error and coefficient of determination (R²) across 30 independent runs.
The results showed that NN-gplearn and NN-PySR consistently outperformed their baseline methods, achieving lower errors and higher R² values. The best-performing expression, discovered by NN-PySR, aligned with Fisher's equation from economics, demonstrating the operator's ability to uncover meaningful relationships in complex datasets.
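For context, Fisher's equation of exchange is the standard economic identity relating the variables in this dataset:

```latex
M V = P T
```

where $M$ is the money supply, $V$ the velocity of money, $P$ the price level, and $T$ the volume of transactions. Recovering an expression of this form from the simulated data is what makes the discovered model economically interpretable.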
Computational Efficiency
The NNO’s computational overhead was analyzed by measuring its training and inference times across datasets of varying sizes. The results indicated that the operator’s time and space complexity scaled linearly with dataset size, with parallelization mitigating some of the computational costs. This scalability makes the NNO suitable for large-scale applications.
Conclusion
The Neural Network Operator represents a significant advancement in genetic programming for symbolic regression. By integrating an RNN into the mutation process, the NNO effectively guides the evolutionary search, leading to faster convergence and higher-quality solutions. Experimental results on both synthetic and real-world datasets confirm the operator’s superiority over traditional GP methods.
Future work could explore alternative neural architectures, such as Transformers, to further enhance the operator’s generalization capabilities. Additionally, reducing the need for dataset-specific training would make the NNO more versatile for practical applications.
DOI: 10.19734/j.issn.1001-3695.2024.09.0337