PERFORMANCE EVALUATION OF MACHINE LEARNING ALGORITHMS FOR NETWORK INTRUSION DETECTION WITH FEATURES COMBINATION

GUPTA, HARSHITA; GAUTAM, AMAN

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20372

Title:	PERFORMANCE EVALUATION OF MACHINE LEARNING ALGORITHMS FOR NETWORK INTRUSION DETECTION WITH FEATURES COMBINATION
Authors:	GUPTA, HARSHITA; GAUTAM, AMAN
Keywords:	MACHINE LEARNING ALGORITHMS; SHELL; NETWORK INTRUSION DETECTION; FEATURES COMBINATION; K-NEAREST NEIGHBORS; CYBERSECURITY
Issue Date:	May-2023
Series/Report no.:	TD-6791;
Abstract:	The prevalence of cyber-attacks in today's digital landscape has created a pressing need for the development of effective intrusion detection systems. Among the various approaches available, machine learning algorithms have emerged as a promising solution in this domain. This research focuses on investigating the effectiveness of three popular machine learning algorithms, namely K-Nearest Neighbors (KNN), Decision Tree, and Random Forest, for network intrusion detection. To evaluate the performance of these algorithms, a dataset comprising both normal network traffic data and intrusion data was collected. The normal data was obtained from Wireshark, a widely used network protocol analyzer, while the intrusion data was sourced from the Canadian Institute for Cybersecurity. This diverse dataset allows for a comprehensive assessment of the algorithms' capabilities in identifying and classifying network intrusions. To ensure a robust evaluation, the dataset was divided into separate training and testing sets using the Scikit-learn library. This division enables the algorithms to be trained on a portion of the data and then evaluated on unseen instances to assess their generalization and predictive abilities. By employing KNN, Decision Tree, and Random Forest algorithms on the training data, the researchers can analyze their performance on the testing data. To measure the accuracy of each algorithm, a cross-validation approach was employed. Cross-validation accuracy provides a reliable estimate of the algorithms' performance by repeatedly partitioning the dataset into training and validation subsets. This technique helps mitigate the impact of dataset bias and provides a more robust evaluation metric. In addition to evaluating the algorithms individually, the researchers explored the impact of combining different traffic features on the accuracy of intrusion detection. 
By grouping the features in pairs, triplets, and larger combinations, they were able to assess the influence of feature selection and combination techniques on the algorithms' performance. This analysis provides valuable insights into the interplay between various traffic features and the effectiveness of the algorithms in detecting intrusions. The experimental results revealed that the highest accuracy achieved was 98.80%, obtained through the combination of two traffic features. This finding underscores the importance of feature selection and combination techniques in enhancing the accuracy of intrusion detection algorithms. By carefully selecting and combining relevant features, the algorithms can extract more meaningful patterns from the data and improve their ability to differentiate between normal and malicious network activity. Furthermore, this research emphasizes the significance of using appropriate datasets for training and testing purposes. The utilization of Wireshark data for normal network traffic and intrusion data from the Canadian Institute for Cybersecurity enhances the realism and relevance of the evaluation. By leveraging authentic and representative datasets, the researchers ensure that the algorithms are exposed to real-world scenarios and can effectively detect various types of cyber-attacks. The findings of this study have practical implications for the development of more robust intrusion detection systems. The insights gained from evaluating the performance of machine learning algorithms, as well as the importance of feature selection and dataset quality, can inform the design and implementation of advanced systems to safeguard against cyber-attacks. By leveraging the knowledge gained in this research, organizations and security practitioners can enhance their ability to detect and mitigate network intrusions, thereby bolstering the overall cybersecurity posture.
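The feature-combination evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract states that Scikit-learn was used, whereas this sketch substitutes a small pure-Python K-Nearest Neighbors classifier and a hand-written k-fold cross-validation so that it runs with the standard library alone. The feature names, the synthetic data, the choice of k = 3, and the 5 folds are all assumptions made for illustration; none of them come from the thesis.

```python
import itertools
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, n_folds=5, k=3, seed=0):
    """Mean accuracy over n_folds folds of a shuffled index list."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return sum(scores) / len(scores)


def rank_feature_combinations(X, y, names, sizes=(1, 2, 3)):
    """Score every feature subset of the given sizes, best accuracy first."""
    results = []
    for r in sizes:
        for cols in itertools.combinations(range(len(names)), r):
            subset = [[row[c] for c in cols] for row in X]
            acc = cross_val_accuracy(subset, y)
            results.append((acc, tuple(names[c] for c in cols)))
    return sorted(results, key=lambda item: -item[0])


# Synthetic stand-in data: label 1 = intrusion, label 0 = normal traffic.
# Feature names are hypothetical; only "packet_count" carries signal here.
rng = random.Random(42)
FEATURES = ["duration", "packet_count", "byte_count"]
X, y = [], []
for _ in range(60):
    label = rng.randint(0, 1)
    X.append([
        rng.gauss(0, 1),                    # duration: pure noise
        rng.gauss(10 if label else 0, 1),   # packet_count: informative
        rng.gauss(0, 1),                    # byte_count: pure noise
    ])
    y.append(label)

ranking = rank_feature_combinations(X, y, FEATURES)
for acc, combo in ranking[:3]:
    print(f"{acc:.3f}  {combo}")
```

With real traffic data, the same loop would typically be run once per classifier (KNN, Decision Tree, Random Forest) and the feature columns would be scaled first, since distance-based methods such as KNN are sensitive to feature magnitude.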
URI:	http://dspace.dtu.ac.in:8080/jspui/handle/repository/20372
Appears in Collections:	M Sc Applied Maths

Files in This Item:

File	Description	Size	Format
Harshita & Aman M.Sc..pdf		557.15 kB	Adobe PDF
