Network graph-based neural architecture search

Authors
Huang, Zhenhan
Other Contributors
Magdon-Ismail, Malik
Gittens, Alex
Gao, Jianxi
Issue Date
2022-05
Keywords
Computer science
Degree
MS
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute (RPI), Troy, NY. Copyright of original work retained by author.
Abstract
Neural networks exhibit great potential in a wide range of applications, but the manual design of neural architectures requires specialized expertise and a large amount of time. Neural architecture search (NAS), which automates architecture engineering, has achieved great success on some benchmarks \cite{domhan2015speeding}. However, NAS is time consuming because it trains and evaluates every neural architecture candidate. Although knowledge about manually designed architectures can shrink the search space, it inevitably introduces human bias. We propose a new way to find favorable neural architectures: a predictor that estimates machine learning performance from graph properties of the architecture. Because the real performance of every candidate does not have to be checked, the search is much faster than the traditional NAS routine. The predictor uses ten graph features and a linear model. To aid neural architecture design and improve computational efficiency, we examine the interchangeability of these graph features in the prediction task, and find that interchangeability appears to be related to the physical meaning of the properties. The same pattern appears in the results of feature selection: different feature selection algorithms pick different first features, but ones with the same physical meaning. We validate the predictor by comparing predicted and actual performance for both multilayer perceptron (MLP) and convolutional neural network (CNN) architectures. Based on the predicted performance, we use a random rewiring strategy to achieve forward rewiring (decreasing machine learning error) and backward rewiring (increasing machine learning error); the predictions agree with the actual performance of the architectures within a limited error tolerance. To accelerate the search for an optimal architecture, we improve on random rewiring with an adaptive rewiring strategy, which is far faster and, more importantly, avoids the premature convergence that affects random rewiring. With the predictor, we are able to find neural architectures with superior or inferior performance, and the statistical results confirm this finding and the robustness of the prediction.
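
As a rough illustration of the pipeline the abstract describes, the sketch below combines a linear predictor over graph features with predictor-guided random rewiring. It is a minimal sketch, not the thesis' implementation: the two features shown stand in for the ten graph features, the weights are hypothetical placeholders for fitted coefficients, and the helper names (extract_features, predicted_error, forward_rewire) are invented for this example. The adaptive rewiring strategy mentioned in the abstract is not sketched here.

import random
import networkx as nx
import numpy as np

def extract_features(G):
    # Two illustrative graph properties; the thesis uses ten features.
    return np.array([
        nx.density(G),
        nx.average_clustering(G),
    ])

def predicted_error(G, w, b):
    # Linear model: predicted machine learning error = w . features + b.
    return float(w @ extract_features(G) + b)

def forward_rewire(G, w, b, steps=200):
    # Randomly rewire with degree-preserving edge swaps, accepting a move
    # only when the predicted error decreases (forward rewiring).
    # Reversing the acceptance test would give backward rewiring.
    best, best_err = G.copy(), predicted_error(G, w, b)
    for _ in range(steps):
        H = best.copy()
        nx.double_edge_swap(H, nswap=1, max_tries=100)
        err = predicted_error(H, w, b)
        if err < best_err:
            best, best_err = H, err
    return best, best_err

if __name__ == "__main__":
    random.seed(0)
    G = nx.random_regular_graph(4, 20, seed=0)
    w, b = np.array([0.6, 0.4]), 0.05  # hypothetical fitted coefficients
    best, err = forward_rewire(G, w, b)
    print(f"predicted error after forward rewiring: {err:.4f}")

Because each candidate is scored by the predictor alone, the search loop never trains a network, which is the source of the speedup over conventional NAS claimed in the abstract.
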
Description
May 2022
School of Science
Department
Dept. of Computer Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection
Access
Restricted to current Rensselaer faculty, staff and students in accordance with the Rensselaer Standard license. Access inquiries may be directed to the Rensselaer Libraries.