Latency-Constrained Neural Architecture Search for U-Nets on Graphics Processing Units

Conference: MBMV 2025 - 28. Workshop
03/11/2025 - 03/12/2025 at Rostock, Germany

Proceedings: ITG-Fb. 320: MBMV 2025

Pages: 9
Language: English
Type: PDF

Authors:
Groth, Stefan; Heidorn, Christian; Schmid, Moritz; Teich, Jürgen; Hannig, Frank

Abstract:
Neural Architecture Search (NAS) is a crucial process for finding novel neural networks for given tasks. For real-time applications, it is beneficial to incorporate latency constraints directly into the search process. One common way to realize NAS is to use multi-objective Bayesian optimization to find candidate neural architectures, which are then trained. Because training models is very time-consuming, we propose an approach that restricts the search space using a method that estimates the inference time of neural architectures, thus considering only models that are below or close to the required inference time. We show that our approach reduces the time to evaluate points in the search space, and therefore the duration of the whole NAS, by multiple orders of magnitude while finding neural architectures of similar quality. Furthermore, we evaluate our approach using a zero-shot proxy that indicates a model's quality without training it. Here, we not only find the best architecture according to the zero-shot proxy, but also reason about the proxy's limitations using our approach.
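To illustrate the idea of latency-restricted search described in the abstract, the following minimal sketch filters randomly sampled candidate architectures by an estimated inference time before scoring them with a training-free proxy. All names and functions here (`estimate_latency_ms`, `zero_shot_proxy`, the toy channel-width encoding, the latency budget) are hypothetical stand-ins, not the paper's actual estimator, proxy, or Bayesian-optimization machinery:

```python
import random

LATENCY_BUDGET_MS = 50.0  # hypothetical real-time constraint
SLACK = 1.1               # keep models "below or close to" the budget

def estimate_latency_ms(arch):
    """Hypothetical analytical latency estimate: sum of per-layer costs
    proportional to input x output channel counts."""
    return sum(0.002 * c_in * c_out for c_in, c_out in zip(arch, arch[1:]))

def zero_shot_proxy(arch):
    """Stand-in for a training-free quality score (wider nets score higher)."""
    return sum(arch)

def sample_arch(rng, depth=4):
    """Randomly sample channel widths for a toy U-Net-like encoder."""
    return [rng.choice([16, 32, 64, 128]) for _ in range(depth)]

def constrained_search(n_samples=200, seed=0):
    """Evaluate only candidates whose estimated latency fits the budget."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_samples):
        arch = sample_arch(rng)
        # Prune candidates whose estimated latency exceeds the budget,
        # so only feasible models reach the (expensive) evaluation step.
        if estimate_latency_ms(arch) > LATENCY_BUDGET_MS * SLACK:
            continue
        score = zero_shot_proxy(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score
```

In the paper's setting, the cheap latency estimate replaces measuring (or training and profiling) every candidate, which is what yields the reported speedup: infeasible points are discarded before any costly evaluation.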