Conference
Record
Venue
Journée GdR IASIS "Attention visuelle : prédiction et applications", INSA Rennes
Language:
French
Terms of use
Standard intellectual property law
DOI : 10.60527/qbw1-zc25
Cite this resource:
GdR IASIS. (2024, May 23). Attention-guided Dynamic inference for model compression. [Video]. Canal-U. https://doi.org/10.60527/qbw1-zc25. (Accessed April 25, 2025)

Attention-guided Dynamic inference for model compression

Recorded: May 23, 2024 - Published online: October 8, 2024
Description

Attention models have recently gained popularity in Machine Learning with Transformer architectures, which dominate Natural Language Processing (NLP) and challenge CNN-based architectures in computer vision tasks. This success is due to the self-attention mechanism, the building block of Transformers, which assigns importance weights to different regions of the input sequence, enabling the model to focus on the information relevant to each prediction. Recent work leverages this inherent attention mechanism to reduce model complexity in image classification: attention identifies the most important regions of the input image, so computation can be allocated only to these salient spatial locations.

The motivation for compressing neural networks stems from their computational complexity, which for Transformers is quadratic, O(N²), in the number of input tokens N, and from their memory requirements, both of which hinder energy efficiency. We explore a novel approach, named dynamic compression, which aims to reduce complexity during inference by dynamically allocating resources based on each input sample. Through a preliminary study, we observed that the fixed partitioning of images into tokens used by Transformers is suboptimal: on image classification, small models (fewer tokens) classify certain sets of images better than bigger models (more tokens).
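To make the mechanism concrete, below is a minimal NumPy sketch of attention-guided token pruning: single-head self-attention produces per-token importance weights, and the attention paid by the first (class) token is used to keep only the most salient tokens. The function names (prune_tokens, keep_ratio) and the identity Q/K/V projections are illustrative assumptions for brevity, not the implementation presented in the talk.

    # Illustrative sketch of attention-guided token pruning (not the talk's code).
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(tokens, d_k):
        """Single-head self-attention; cost is O(N^2) in the token count N."""
        Q = K = V = tokens                    # identity projections keep the sketch short
        scores = Q @ K.T / np.sqrt(d_k)       # (N, N) pairwise scores
        weights = softmax(scores, axis=-1)    # importance weights per token pair
        return weights @ V, weights

    def prune_tokens(tokens, keep_ratio=0.5):
        """Keep only the most salient tokens, ranked by the attention that
        the class token (row 0) pays to every token."""
        _, weights = self_attention(tokens, d_k=tokens.shape[1])
        cls_attention = weights[0]            # saliency score per token
        n_keep = max(1, int(keep_ratio * len(tokens)))
        keep = np.sort(np.argsort(cls_attention)[::-1][:n_keep])  # preserve order
        return tokens[keep]

    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(197, 64))       # e.g. ViT-B/16 tokens for a 224x224 image
    pruned = prune_tokens(tokens, keep_ratio=0.5)
    print(tokens.shape, "->", pruned.shape)   # (197, 64) -> (98, 64)

Since self-attention cost grows as O(N²) in the token count, halving the kept tokens roughly quarters the attention cost; in a dynamic-inference setting, the keep ratio would be chosen per input sample rather than fixed, so easier inputs receive a smaller token budget.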