Real-Time Object Detection and Classification Using CNN Algorithm
Keywords:
YOLO, CNN, Real-time object detection, Deep Learning, RNN
Abstract
Real-time object detection and classification have become essential in various applications such as surveillance, autonomous vehicles, and smart systems. This paper presents a robust approach for real-time object detection and classification using Convolutional Neural Networks (CNN). The proposed method leverages deep learning techniques to automatically extract spatial features from input images and accurately identify objects within dynamic environments. A pre-trained CNN model is fine-tuned and integrated with a detection framework to achieve high accuracy and low latency. The system is evaluated on standard datasets, demonstrating improved performance in terms of precision, recall, and processing speed compared to traditional methods. The results indicate that CNN-based models provide an efficient and scalable solution for real-time object detection tasks in complex scenarios.
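The abstract reports precision and recall but does not state how detections are matched to ground truth. The sketch below shows one common convention, offered as an assumption rather than the authors' procedure: a predicted box counts as a true positive when its intersection over union (IoU) with an unmatched ground-truth box is at least 0.5, with predictions processed in descending confidence order. The names `iou` and `precision_recall`, the box layout, the 0.5 threshold, and the single-class setting are all illustrative choices.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Tuple[Box, float]],  # (box, confidence score)
    truths: List[Box],
    iou_thr: float = 0.5,            # assumed threshold, not from the paper
) -> Tuple[float, float]:
    """Greedy single-class matching: highest-confidence predictions first."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    print(precision_recall(preds, truths))
```

In the usage example, one of two predictions matches one of two ground-truth boxes, so both precision and recall come out to 0.5. A multi-class evaluation would apply the same matching separately per class label.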
License
Copyright (c) 2026 International Journal of Engineering Science & Humanities

This work is licensed under a Creative Commons Attribution 4.0 International License.


