Attention-guided Dynamic inference for model compression
Description
Attention models have recently gained popularity in Machine Learning with Transformer architectures, which dominate Natural Language Processing (NLP) and challenge CNN-based architectures in computer vision tasks. This success is due to the self-attention mechanism, the building block of Transformers, which assigns importance weights to different regions of the input sequence, enabling the model to focus on the information relevant to each prediction.

Recent work leverages the attention mechanism inherent in Transformers to reduce model complexity in image classification. Specifically, it uses attention to identify the most important regions of the input image, so that computation can be allocated to these salient spatial locations only. The motivation for compressing neural networks stems from their computational complexity, which is quadratic (O(N²)) for Transformers, where N is the number of input tokens, and from their memory requirements, both of which hinder their energy efficiency.

In our case, we explore a novel approach, named dynamic compression, which aims to reduce complexity during inference by dynamically allocating resources based on each input sample. Through a preliminary study, we observed that Transformers partition input images into tokens in a suboptimal way: for the image classification task, small models (fewer tokens) classify a subset of images better than bigger models (more tokens).
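To make the idea of allocating computation to salient locations concrete, the following is a minimal sketch (not the method studied in this work) of attention-guided token pruning in PyTorch: the attention paid by the [CLS] token to each patch token is used as an importance score, and only the top-scoring patch tokens are kept for subsequent layers. The function name, the keep ratio, and the shapes are illustrative assumptions.

```python
import torch

def prune_tokens_by_cls_attention(tokens, attn, keep_ratio=0.5):
    """Illustrative sketch: keep the patch tokens that receive the most
    attention from the [CLS] token, plus the [CLS] token itself.

    tokens: (B, N, D) token embeddings, tokens[:, 0] assumed to be [CLS]
    attn:   (B, H, N, N) attention weights from a self-attention layer
    keep_ratio: fraction of patch tokens to keep (hypothetical parameter)
    """
    B, N, D = tokens.shape
    # Attention from [CLS] to every patch token, averaged over heads: (B, N-1)
    cls_attn = attn[:, :, 0, 1:].mean(dim=1)
    num_keep = max(1, int(keep_ratio * (N - 1)))
    # Indices of the most-attended (salient) patch tokens
    topk = cls_attn.topk(num_keep, dim=1).indices            # (B, num_keep)
    idx = topk.unsqueeze(-1).expand(-1, -1, D)               # (B, num_keep, D)
    patches = tokens[:, 1:].gather(1, idx)
    # Re-attach the [CLS] token in front of the retained patches
    return torch.cat([tokens[:, :1], patches], dim=1)

if __name__ == "__main__":
    B, H, N, D = 2, 4, 197, 64        # e.g. ViT-style: 196 patches + [CLS]
    tokens = torch.randn(B, N, D)
    attn = torch.softmax(torch.randn(B, H, N, N), dim=-1)
    pruned = prune_tokens_by_cls_attention(tokens, attn, keep_ratio=0.5)
    print(pruned.shape)               # torch.Size([2, 99, 64])
```

Because self-attention cost scales quadratically with the number of tokens N, halving the token count in this way reduces the attention compute of the remaining layers by roughly a factor of four.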