SC Seminar: Shalini Shalini

Shalini Shalini, TU Kaiserslautern

Title: Sparse Deep Neural Network and Hyperparameter Optimization

Abstract:

Despite the considerable success of deep learning in recent years, deploying state-of-the-art deep neural networks remains challenging because of their high computational and memory cost. Recent deep learning research has focused on optimally generating sparse neural networks using nonsmooth regularization such as the L_1 and L_{2,1} norms. The resulting training problem is nonsmooth, however, and conventional stochastic gradient descent (SGD) is not guaranteed to converge on it. A recent solution is the Proximal Stochastic Gradient Descent (ProxSGD) optimizer, which solves this nonsmooth optimization problem with guaranteed and much faster convergence. In practice, the performance of ProxSGD can be sensitive to the precise setup of its internal hyperparameters. The main focus of this thesis is to train sparse neural networks effectively through weight pruning and filter pruning using the ProxSGD optimizer, together with optimization of its hyperparameters. A new approach, GSparsity, is introduced for efficient implementation of filter pruning.
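To illustrate how a proximal method produces exact zeros under L_1 or L_{2,1} regularization, the following is a minimal NumPy sketch of the two standard proximal operators and of one plain proximal gradient step. It is not the ProxSGD algorithm from the talk, whose exact update rule is not given in this abstract. The function names, the learning rate `lr`, the regularization weight `lam`, and the convention that each row of `w` is one group (for example, one flattened filter) are assumptions made for illustration.

```python
import numpy as np

def prox_l1(w, threshold):
    """Soft-thresholding: proximal operator of threshold * ||w||_1.
    Entries with magnitude below the threshold become exactly zero."""
    return np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)

def prox_group_l2(w, threshold):
    """Group soft-thresholding: proximal operator of threshold * sum of row L2 norms.
    Assumption: each row of w is one group, e.g. one flattened filter."""
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zero.
    scale = np.maximum(1.0 - threshold / np.maximum(norms, 1e-12), 0.0)
    return w * scale

def proximal_step(w, grad, lr, lam, prox=prox_l1):
    """One generic proximal gradient step (hypothetical helper):
    a gradient step on the smooth loss, followed by the proximal operator."""
    return prox(w - lr * grad, lr * lam)
```

With `prox_l1`, individual small weights are set to zero (weight pruning); with `prox_group_l2`, an entire row is zeroed once its norm falls below the threshold, which is the mechanism that makes group regularization suitable for filter pruning.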

First, Bayesian optimization and evolutionary algorithms are used to optimize the hyperparameters of ProxSGD, yielding a range of hyperparameters that achieves good accuracy and compression rate. For example, with DenseNet-201 on the CIFAR100 dataset, an accuracy of 72.01% is achieved at a compression rate of 27.24x (96.33% of weights pruned), and with ResNet-56 on CIFAR10, 93% accuracy is achieved after removing 93.51% of the parameters, with no loss relative to the baseline accuracy. Second, experiments show that ProxSGD performance (in terms of accuracy and compression rate) improves when the remaining weights are fine-tuned using the Adam optimizer with a cosine learning-rate scheduler. Third, the GSparsity approach via ProxSGD is proposed for filter pruning; empirically, it achieves new state-of-the-art results for filter pruning.
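The relation between the pruned fraction and the compression rate quoted above can be checked with a short calculation. This sketch assumes that the compression rate is defined as the total number of weights divided by the number of remaining weights; the abstract does not state the definition, and the function name is hypothetical.

```python
def compression_rate(pruned_fraction):
    """Compression rate under the assumed definition:
    total weights / remaining weights = 1 / (1 - pruned fraction)."""
    if not 0.0 <= pruned_fraction < 1.0:
        raise ValueError("pruned_fraction must lie in [0, 1)")
    return 1.0 / (1.0 - pruned_fraction)

# DenseNet-201 figure from the abstract: 96.33% of weights pruned.
densenet_rate = compression_rate(0.9633)
# ResNet-56 figure from the abstract: 93.51% of parameters removed.
resnet_rate = compression_rate(0.9351)
```

Under this assumption, pruning 96.33% of the weights gives a rate of roughly 27.2x, in line with the 27.24x reported for DenseNet-201; the small difference is attributable to rounding of the pruned percentage.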

How to join:

The talk is held online via Zoom. You can join with the link https://uni-kl-de.zoom.us/j/94636397127?pwd=Y1g4dGVFQitzUHVRQUFpcFB4WVFKQT09.
