Mathematics-Driven Enhancements in Object Detection: a Hybrid Deep Learning Framework

EasyChair Preprint 15504
6 pages • Date: November 30, 2024

Abstract

This paper explores the mathematical foundation of hybrid object detection models, combining Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). We provide a detailed mathematical formulation for feature extraction, attention mechanisms, and optimization strategies. By integrating advanced regularization techniques and loss functions, we aim to improve accuracy while reducing computational overhead. Key contributions include mathematical derivations for attention-aware convolutional layers and a custom dynamic loss function that balances localization and classification errors.

Keyphrases: Algorithms, CNN, ViT, deep learning