Ieracitano C, Adeel A, Gogate M, Dashtipour K, Morabito FC, Larijani H, Raza A & Hussain A (2018) Statistical Analysis Driven Optimized Deep Learning System for Intrusion Detection. In: Ren J, Hussain A, Zheng J, Liu C, Luo B, Zhao H & Zhao X (eds.) Advances in Brain Inspired Cognitive Systems. BICS 2018. Lecture Notes in Computer Science, 10989. BICS 2018: International Conference on Brain Inspired Cognitive Systems, Xi'an, China, 07.07.2018-08.07.2018. Cham, Switzerland: Springer Verlag, pp. 759-769. https://doi.org/10.1007/978-3-030-00563-4_74
Abstract Attackers have developed ever more sophisticated and intelligent ways to hack information and communication technology (ICT) systems. The extent of damage an individual hacker can carry out upon infiltrating a system is well understood. A potentially catastrophic scenario can be envisaged in which a nation-state intercepting encrypted financial data gets hacked. Thus, intelligent cybersecurity systems have become essential for improved protection against malicious threats. However, as malware attacks continue to increase dramatically in volume and complexity, it has become ever more challenging for traditional analytic tools to detect and mitigate threats. Furthermore, the huge amount of data produced by large networks has made the recognition task even more complicated and challenging. In this work, we propose an innovative statistical analysis driven optimized deep learning system for intrusion detection. The proposed intrusion detection system (IDS) extracts optimized and more correlated features using big data visualization and statistical analysis methods, followed by a deep autoencoder (AE) for potential threat detection. Specifically, a preprocessing module eliminates the outliers and converts categorical variables into one-hot-encoded vectors. The feature extraction module discards features with more than 80% null values and selects the most significant features as input to the deep autoencoder model, which is trained in a greedy layer-wise manner. The NSL-KDD dataset (an improved version of the original KDD dataset) from the Canadian Institute for Cybersecurity is used as a benchmark to evaluate the feasibility and effectiveness of the proposed architecture. Simulation results demonstrate the potential of the proposed IDS for improving intrusion detection as compared to existing state-of-the-art methods.
Keywords Cybersecurity; Deep learning; Autoencoder; NSL-KDD dataset
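
The two preprocessing steps named in the abstract (discarding features with more than 80% null values, and one-hot encoding categorical variables) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy records, and the column `sparse_feature` are hypothetical, and missing values are assumed to be represented as `None`. Outlier removal and the autoencoder are omitted because the abstract does not specify how they are done.

```python
from typing import Dict, List, Sequence, Union

Value = Union[float, str, None]
Row = Dict[str, Value]


def drop_sparse_features(rows: List[Row], max_null_fraction: float = 0.8) -> List[Row]:
    """Remove columns whose fraction of missing (None) values exceeds the threshold."""
    if not rows:
        return []
    n = len(rows)
    keep = [
        col for col in rows[0]
        if sum(r[col] is None for r in rows) / n <= max_null_fraction
    ]
    return [{col: r[col] for col in keep} for r in rows]


def one_hot_encode(rows: List[Row], categorical: Sequence[str]) -> List[Row]:
    """Replace each categorical column with one 0/1 indicator per observed category."""
    if not rows:
        return []
    # Collect the sorted set of observed categories for each categorical column.
    categories = {
        col: sorted({r[col] for r in rows if r[col] is not None})
        for col in categorical if col in rows[0]
    }
    encoded = []
    for r in rows:
        out: Row = {}
        for col, value in r.items():
            if col in categories:
                for cat in categories[col]:
                    out[f"{col}={cat}"] = 1 if value == cat else 0
            else:
                out[col] = value
        encoded.append(out)
    return encoded


# Toy records for illustration only (not taken from the NSL-KDD dataset).
rows = [
    {"duration": 0.0, "protocol_type": "tcp", "sparse_feature": None},
    {"duration": 2.0, "protocol_type": "udp", "sparse_feature": None},
    {"duration": 1.0, "protocol_type": "icmp", "sparse_feature": None},
    {"duration": 0.0, "protocol_type": "tcp", "sparse_feature": None},
    {"duration": 5.0, "protocol_type": "tcp", "sparse_feature": None},
    {"duration": 3.0, "protocol_type": "udp", "sparse_feature": 1.0},
]

filtered = drop_sparse_features(rows)               # sparse_feature is 5/6 null, so it is dropped
encoded = one_hot_encode(filtered, ["protocol_type"])
print(encoded[0])
```

In this sketch the threshold comparison is strict, so a feature that is exactly 80% null is kept; whether the paper treats the boundary this way is not stated in the abstract.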