Article

Group Sparse Regularization for Deep Neural Networks

Details

Citation

Scardapane S, Comminiello D, Hussain A & Uncini A (2017) Group Sparse Regularization for Deep Neural Networks. Neurocomputing, 241, pp. 81-89. https://doi.org/10.1016/j.neucom.2017.02.029

Abstract
In this paper, we address the challenging task of simultaneously optimizing (i) the weights of a neural network, (ii) the number of neurons for each hidden layer, and (iii) the subset of active input features (i.e., feature selection). While these problems are traditionally dealt with separately, we propose an efficient regularized formulation that allows them to be solved simultaneously, using standard optimization routines. Specifically, we extend the group Lasso penalty, originally proposed in the linear regression literature, to impose group-level sparsity on the network's connections, where each group is defined as the set of outgoing weights from a unit. Depending on the specific case, the weights can be related to an input variable, to a hidden neuron, or to a bias unit, so that all of the aforementioned tasks are performed simultaneously in order to obtain a compact network. We carry out an extensive experimental evaluation, in comparison with classical weight decay and Lasso penalties, both on a toy dataset for handwritten digit recognition and on multiple realistic mid-scale classification benchmarks. Comparative results demonstrate the potential of our proposed sparse group Lasso penalty in producing extremely compact networks, with a significantly lower number of input features, and with a classification accuracy that is equal or only slightly inferior to that obtained with standard regularization terms.
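
As a rough illustration of the grouping described in the abstract, the following minimal sketch computes a group Lasso penalty in which each group is the set of outgoing weights from one unit. It is not taken from the paper's code: the function names, the row-per-unit weight layout, the square-root-of-group-size scaling (a common convention for group Lasso), the combination of the group term with a plain Lasso term for the "sparse" variant, and the tolerance `tol` are all assumptions made for this example.

```python
import math

def group_lasso_penalty(weights):
    """Group Lasso penalty over rows.

    Assumption: weights[i] holds the outgoing weights of unit i, so each
    row is one group. Each group's L2 norm is scaled by sqrt(group size),
    a common convention that is assumed here rather than quoted.
    """
    total = 0.0
    for row in weights:
        l2_norm = math.sqrt(sum(w * w for w in row))
        total += math.sqrt(len(row)) * l2_norm
    return total

def sparse_group_lasso_penalty(weights):
    """Group Lasso term plus an element-wise L1 (Lasso) term (assumed form)."""
    l1_term = sum(abs(w) for row in weights for w in row)
    return group_lasso_penalty(weights) + l1_term

def active_units(weights, tol=1e-8):
    """Indices of units whose outgoing weights are not all (near) zero."""
    return [
        i for i, row in enumerate(weights)
        if math.sqrt(sum(w * w for w in row)) > tol
    ]

# Hypothetical 3-unit layer with 2 outgoing weights per unit.
W = [
    [3.0, 4.0],  # unit 0: active, L2 norm 5
    [0.0, 0.0],  # unit 1: whole group is zero, so the unit can be removed
    [0.0, 1.0],  # unit 2: active, L2 norm 1
]

print(group_lasso_penalty(W))
print(sparse_group_lasso_penalty(W))
print(active_units(W))
```

In this layout, a row that is driven entirely to zero corresponds to an input feature or hidden neuron with no outgoing connections, which is what allows the unit to be dropped from the network.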

Keywords
Deep networks; Group sparsity; Pruning; Feature selection

Journal
Neurocomputing: Volume 241

Status: Published
Publication date: 07/06/2017
Publication date online: 10/02/2017
Date accepted by journal: 07/02/2017
URL: http://hdl.handle.net/1893/24942
Publisher: Elsevier
ISSN: 0925-2312