Abstract
Convolutional Neural Networks (CNNs) achieve state-of-the-art performance across various application domains but are often resource-intensive, limiting their use on resource-constrained devices. Low-rank factorization (LRF) has emerged as a promising technique to reduce the computational complexity and memory footprint of CNNs, enabling efficient deployment without significant performance loss. However, challenges remain in solving the rank selection problem, balancing memory reduction against accuracy, and integrating LRF into the training process of CNNs. In this paper, a novel and generic methodology for layer-wise rank selection is presented that takes inter-layer interactions into account. Our approach is compatible with any decomposition method and does not require additional retraining. The proposed methodology is evaluated on thirteen widely used CNN models, significantly reducing model parameters and floating-point operations (FLOPs). In particular, our approach achieves up to a 94.6% parameter reduction (82.3% on average) and up to a 90.7% FLOPs reduction (59.6% on average), with less than a 1.5% drop in validation accuracy, demonstrating superior performance and scalability compared to existing techniques.
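To make the idea of low-rank factorization concrete, here is a minimal sketch using truncated SVD on a dense weight matrix. This is an illustrative example only, not the paper's rank-selection methodology; the function name, shapes, and the chosen rank are assumptions.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (m x n) as A @ B with A (m x rank) and B (rank x n).

    Illustrative truncated-SVD factorization; the rank here is fixed by
    hand, whereas rank-selection methods choose it per layer.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))   # a hypothetical layer's weights
A, B = low_rank_factorize(W, rank=64)

orig_params = W.size                  # 256 * 512 = 131072
lr_params = A.size + B.size           # 256*64 + 64*512 = 49152
print(orig_params, lr_params)         # prints: 131072 49152
```

At rank 64 the factorized layer stores about 62.5% fewer parameters than the original; the same trade-off between rank, storage, and approximation error is what layer-wise rank-selection methods aim to optimize.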
| Original language | English |
|---|---|
| Title of host publication | DATE 2025 Conference |
| Publication status | Published - 2 Apr 2025 |
| Event | DATE 25 (Design, Automation and Test in Europe) - Lyon, France. Duration: 31 Mar 2025 → 2 Apr 2025 |
Conference
| Conference | DATE 25 (Design, Automation and Test in Europe) |
|---|---|
| Country/Territory | France |
| City | Lyon |
| Period | 31/03/25 → 2/04/25 |