Abstract:
Internet of Things (IoT) devices have made networks vulnerable to Distributed Denial-of-Service (DDoS) attacks, especially SYN flood attacks, which can severely disrupt services. This project presents a machine learning approach to detecting SYN flood intrusions in IoT networks using a large traffic dataset that includes IP addresses, port numbers, TCP flags, and time-based metadata. The dataset, more than 1 GB in size, was preprocessed to remove formatting errors such as European decimal separators and non-numeric entries. Exploratory data analysis (EDA) was performed to visualize the traffic distribution and the correlation between TCP flags and packet lengths. Several machine learning models, including Random Forest, XGBoost, and Logistic Regression, were trained and evaluated on Accuracy, F1 Score, and ROC-AUC. The Random Forest classifier performed best, with accuracy above 99%, which suggests it is a strong candidate for real-time anomaly detection in the proposed system. Feature importance analysis showed that the SYN and ACK flags and the TTL value strongly influenced the model's predictions, which is consistent with established SYN flood behavior patterns. This solution demonstrates the effectiveness of supervised learning for intrusion detection in complex IoT environments. The findings not only confirm the efficacy of ensemble learning techniques on real-world IoT traffic but also provide a scalable pipeline for future research in intelligent threat mitigation. The proposed model and preprocessing framework are a step toward resilient IoT infrastructures that are adaptive and efficient and that detect threats early.
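As an illustration of the preprocessing step summarized above, the following minimal sketch shows one way European decimal separators and non-numeric entries could be handled. It is not the project's actual pipeline: the function names `parse_european_number` and `clean_rows` and the field names `length` and `ttl` are hypothetical, and the sketch assumes that a comma, when present, is the decimal separator and any dots are thousands separators.

```python
import math
from typing import Optional


def parse_european_number(raw: str) -> Optional[float]:
    """Parse strings such as '0,75' or '1.234,5'; return None if not numeric.

    Assumption: when a comma is present it is the decimal separator and
    dots are thousands separators.
    """
    text = raw.strip()
    if not text:
        return None
    if "," in text:
        text = text.replace(".", "").replace(",", ".")
    try:
        value = float(text)
    except ValueError:
        return None
    # Treat 'nan' and 'inf' as invalid entries rather than usable numbers.
    return value if math.isfinite(value) else None


def clean_rows(rows, numeric_fields):
    """Keep only rows whose numeric fields all parse; store the parsed floats."""
    cleaned = []
    for row in rows:
        parsed = {
            name: parse_european_number(str(row.get(name, "")))
            for name in numeric_fields
        }
        if any(value is None for value in parsed.values()):
            continue  # drop rows that contain a non-numeric entry
        cleaned.append({**row, **parsed})
    return cleaned


# Hypothetical sample records; the field names are illustrative only.
sample = [
    {"length": "60", "ttl": "64", "flag": "SYN"},
    {"length": "abc", "ttl": "64", "flag": "ACK"},
    {"length": "0,5", "ttl": "128", "flag": "SYN"},
]
cleaned_sample = clean_rows(sample, ["length", "ttl"])
```

Dropping a malformed row entirely is only one possible policy; imputing the missing value or logging the row for review may be preferable depending on how many records are affected.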