The Evolution of Knowledge Distillation in Image Classification Tasks
DOI: https://doi.org/10.61173/c18pw152

Keywords: Knowledge distillation, Computer vision, Model compression

Abstract
In today’s digital age, image classification plays a crucial role as a key task in the field of computer vision. Image classification aims to assign images accurately to predefined categories, but training efficient models for high-precision classification on large-scale datasets remains challenging. To this end, researchers have developed knowledge distillation strategies that compress models while preserving performance. Knowledge distillation transfers the capabilities of a complex teacher model to a lightweight student model, enabling the student to learn from the teacher even with limited data and without altering the original model structure, which gives the approach excellent scalability. Knowledge is transferred primarily in three forms: model outputs, feature-map matching, and structural knowledge. This paper analyzes and discusses three corresponding directions: output-oriented methods, covering decoupled knowledge distillation and decision-boundary interpretation structures; feature-based methods, covering classical attention transfer and Wasserstein-distance matching; and relation-based methods, covering virtual distillation techniques and knowledge distillation that preserves Lipschitz continuity.
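The output-based transfer described above is commonly implemented as a KL divergence between temperature-softened teacher and student class distributions, in the spirit of Hinton et al.’s original formulation. The following is a minimal, dependency-free sketch (not the specific method of any paper surveyed here); the temperature value and the T² scaling convention are standard assumptions, not details drawn from this abstract.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T yields softer distributions."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL divergence between softened teacher and student outputs.

    Scaled by T^2 (a common convention) so gradient magnitudes stay
    comparable across temperature settings.
    """
    p = softmax(teacher_logits, temperature)  # teacher "soft targets"
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

# When the student exactly matches the teacher, the loss is zero.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))
```

In practice this term is combined with the ordinary cross-entropy on ground-truth labels, weighted by a mixing coefficient, so the student learns from both hard labels and the teacher’s soft outputs.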