Super Convergence Cosine Annealing with Warm-Up Learning Rate

Conference: CAIBDA 2022 - 2nd International Conference on Artificial Intelligence, Big Data and Algorithms
June 17-19, 2022, Nanjing, China

Proceedings: CAIBDA 2022

Pages: 7 | Language: English | Type: PDF

Authors:
Liu, Zhao (University of California, San Diego, USA)

Abstract:
Choosing an appropriate learning rate for deep neural networks is critical to achieving good performance. Although optimizers such as RMSprop, AdaGrad, and Adam can adjust the learning rate adaptively, SGD with a fine-tuned learning rate gives better results in most cases. In this paper, we present a learning rate schedule called Super Convergence Cosine Annealing with Warm-Up (SCCA) that increases the learning rate to a relatively large value and then decreases it using cosine annealing, so that deep neural networks converge quickly and reach their best performance. We demonstrate the results of learning rate schedules on different architectures (ResNet, ResNeXt, GoogLeNet, and VGG) and on the CIFAR-10 and CIFAR-100 datasets. SCCA improves test accuracies by about 2% on CIFAR-10 and 5% on CIFAR-100.
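The abstract describes a schedule that ramps the learning rate up to a large peak and then decays it with cosine annealing. Below is a minimal sketch of such a warm-up-plus-cosine schedule; the function name scca_like_lr, the linear warm-up, and the hyperparameters (warmup_steps, peak_lr, min_lr) are illustrative assumptions, not details taken from the paper.

```python
import math

def scca_like_lr(step, total_steps, warmup_steps=5, peak_lr=0.1, min_lr=1e-4):
    """Illustrative warm-up + cosine annealing learning rate schedule.

    Linearly increases the learning rate from min_lr to peak_lr over
    `warmup_steps`, then decays it back toward min_lr with cosine annealing.
    Parameter names and default values are hypothetical, not from the paper.
    """
    if step < warmup_steps:
        # Linear warm-up phase.
        return min_lr + (peak_lr - min_lr) * step / warmup_steps
    # Cosine annealing phase over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

if __name__ == "__main__":
    # Print the schedule every 10 epochs of a 100-epoch run.
    for epoch in range(0, 100, 10):
        print(epoch, round(scca_like_lr(epoch, total_steps=100), 5))
```

In a training loop, the returned value would typically be assigned to the optimizer's learning rate at the start of each epoch or step.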