Abstract:
Weapons are a critical and serious topic and have become a severe threat to current security
needs. People who bring firearms into airports, schools, and other secure locations pose a
threat to public safety. In certain regions of the globe, mass shootings and gun violence are on
the increase. Such situations are time-sensitive and may result in significant loss of
life and property. Although CCTV systems have been deployed in many establishments, they
require operators to continuously examine the video streams for weapons. The ability to
identify suspicious activity is proportional to their attention to each video stream shown on
the screen, thus leading to a high rate of false positives, which can become a liability to the
daily operational needs of institutions. Therefore, the demand for video surveillance systems
capable of recognizing firearms automatically has increased, and such systems play an
important role in intelligent monitoring. Several object detection models are available, but they
struggle to recognize firearms because of the weapons' distinctive size and form, as well as the varied colours
of the background. This thesis presents a comprehensive literature review of recent
vision-based approaches for automated detection of firearms from images and videos. The literature
has broadly been categorized into classic vision/machine-learning-based approaches and
deep-learning-based approaches. In this research, we further explored various deep learning
alternatives for accurate firearm detection. For region-based detection, we propose a
deep-learning-based weapon detection system employing YOLOv5 that is intended to be sufficiently
robust to affine transformation, rotation, occlusion, and variation in size. The system was
evaluated on a publicly available dataset and achieved an F1-score of 95.43%. Instance
segmentation, i.e., pixel-level segmentation, was also performed using Mask R-CNN for
the detection and segmentation of firearms. We achieved a detection accuracy (DC) of
90.66% and a mean intersection over union (mIoU) of 88.74%. The proposed methodology
combined both techniques with different preprocessing methods and various data
augmentation techniques to improve the efficiency and accuracy of the system.