Optimising neural network hyperparameters by predicting unseen architectures with learning curve approach
Smith, C. and Wong, T.C. (2026) Optimising neural network hyperparameters by predicting unseen architectures with learning curve approach. Applied Soft Computing, 187. 114340. ISSN 1568-4946 (https://doi.org/10.1016/j.asoc.2025.114340)
Text.
Filename: Smith-Wong-ASC-2025-Optimising-neural-network-hyperparameters-by-predicting-unseen-architectures.pdf
Accepted Author Manuscript
Abstract
Hyperparameter optimisation (HPO) is a critical yet computationally expensive task in training neural networks. While recent research has used learning curve prediction to terminate poor configurations early, no existing method can predict the full set of unseen learning curves within a search space. Current techniques often rely on meta-learning across datasets or partial curve information, limiting both generalisability and precision. This study introduces SEquential LEarning Curve Training (SELECT), a novel HPO approach that accurately predicts the full learning curves of all unseen hyperparameter configurations using a sequence prediction model trained on a subset of the same dataset. SELECT employs a new representation of learning curve data—converted into structured sequences with aligned starting points—and leverages a Convolutional Gated Recurrent Neural Network (CGRNN) for high-fidelity forecasting. Benchmarked across multiple real-world regression datasets, SELECT consistently outperformed established methods including Random Search (RS), Hyperband (HB), Gaussian Process Bayesian Optimisation (GPBO), and Tree-structured Parzen Estimator (TPE), achieving up to 20% improvements in predictive accuracy while maintaining consistent and predictable computation time. Its design enables full parallel training of configurations, making it highly suitable for large-scale or time-sensitive applications. In addition to its performance and scalability, SELECT offers a unique advantage in search space visualisation. By predicting entire learning curves, it allows practitioners to inspect the shape and structure of the optimisation landscape, revealing patterns such as diminishing returns and performance plateaus. This transparency enhances trust, interpretability, and decision-making in both research and industrial HPO workflows.
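The abstract's point about inspecting predicted curves can be illustrated with a minimal sketch. It assumes the full learning curves have already been predicted by some model and are available as lists of per-epoch validation losses. It is not the SELECT implementation: the function names `best_configuration` and `plateau_epoch`, the tolerance value, and the example numbers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def best_configuration(curves: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the configuration whose predicted curve ends at the lowest loss."""
    name = min(curves, key=lambda k: curves[k][-1])
    return name, curves[name][-1]


def plateau_epoch(curve: List[float], tol: float = 0.01) -> int:
    """Return the earliest epoch index from which every later per-epoch
    improvement is smaller than `tol` (a simple diminishing-returns check).
    If the curve is still improving at the end, the last index is returned."""
    start = len(curve) - 1
    # Walk backwards while the improvement between consecutive epochs is small.
    for i in range(len(curve) - 2, -1, -1):
        if curve[i] - curve[i + 1] >= tol:
            break
        start = i
    return start


# Hypothetical predicted validation losses for three configurations (made-up values).
predicted_curves = {
    "lr=0.01": [1.0, 0.5, 0.3, 0.295, 0.294],
    "lr=0.1": [1.0, 0.4, 0.35, 0.34, 0.34],
    "lr=0.001": [1.0, 0.9, 0.8, 0.7, 0.6],
}

name, final_loss = best_configuration(predicted_curves)
print(name, final_loss, plateau_epoch(predicted_curves[name]))
```

Because every curve is available in full, the same two helpers could be applied across the whole search space to see where additional epochs stop paying off, which is the kind of landscape inspection the abstract describes.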
ORCID iDs
Smith, C. and Wong, T.C.
ORCID: https://orcid.org/0000-0001-8942-1984
Item type: Article
ID code: 94831
Dates:
  Date | Event
  1 February 2026 | Published
  3 December 2025 | Published Online
  25 November 2025 | Accepted
Subjects: Science > Mathematics > Electronic computers. Computer science
Department: Faculty of Engineering > Design, Manufacture and Engineering Management
Depositing user: Pure Administrator
Date deposited: 27 Nov 2025 10:43
Last modified: 03 Feb 2026 17:33
URI: https://strathprints.strath.ac.uk/id/eprint/94831