Machine learning based IoT Intrusion Detection System : an MQTT case study (MQTT-IoT-IDS2020 Dataset)

Hindy, Hanan and Bayne, Ethan and Bures, Miroslav and Atkinson, Robert and Tachtatzis, Christos and Bellekens, Xavier; Ghita, Bogdan and Shiaeles, Stavros, eds. (2021) Machine learning based IoT Intrusion Detection System : an MQTT case study (MQTT-IoT-IDS2020 Dataset). In: Selected Papers from the 12th International Networking Conference, INC 2020. Lecture Notes in Networks and Systems, LNNS. Springer, GRC, pp. 73-84. ISBN 9783030647582

Text. Filename: Hindy_etal_INC2020_Machine_learning_based_IoT_Intrusion_Detection_System.pdf
Accepted Author Manuscript



The Internet of Things (IoT) is one of the main research fields in the cybersecurity domain. This is due to (a) the increased dependency on automated devices, and (b) the inadequacy of general-purpose Intrusion Detection Systems (IDS) for deployment in special-purpose networks. Numerous lightweight protocols are being proposed for IoT device communication. One of the prominent IoT machine-to-machine communication protocols is the Message Queuing Telemetry Transport (MQTT) protocol. However, to the best of the authors' knowledge, there are no available IDS datasets that include MQTT benign or attack instances and, thus, no IDS experimental results are available. In this paper, the effectiveness of six Machine Learning (ML) techniques in detecting MQTT-based attacks is evaluated. Three abstraction levels of features are assessed, namely, packet-based, unidirectional flow, and bidirectional flow features. A simulated MQTT dataset is generated and used for the training and evaluation processes. The dataset is released with an open access licence to help the research community further analyse the accompanying challenges. The experimental results demonstrate that the proposed ML models meet the IDS requirements of MQTT-based networks. Moreover, the results emphasise the importance of using flow-based features to discriminate MQTT-based attacks from benign traffic, while packet-based features are sufficient for traditional networking attacks.
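
The difference between the three feature abstraction levels named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' feature extraction pipeline: the `Packet` fields, the helper names (`uni_key`, `bi_key`, `aggregate`), the two aggregate features (packet and byte counts), and the sample traffic are all assumptions made for illustration. The only protocol fact relied on is that 1883 is the default MQTT port.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet-level record (the packet-based abstraction level)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str
    length: int


def uni_key(p):
    # Unidirectional flow: the ordered 5-tuple, so each direction is its own flow.
    return (p.src_ip, p.src_port, p.dst_ip, p.dst_port, p.proto)


def bi_key(p):
    # Bidirectional flow: sort the two endpoints so both directions share one key.
    lo, hi = sorted([(p.src_ip, p.src_port), (p.dst_ip, p.dst_port)])
    return (lo, hi, p.proto)


def aggregate(packets, key_fn):
    # Illustrative flow features only: packet count and total bytes per flow.
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        flow = flows[key_fn(p)]
        flow["packets"] += 1
        flow["bytes"] += p.length
    return dict(flows)


# Made-up exchange between a client and a broker on the default MQTT port.
packets = [
    Packet("10.0.0.5", 50000, "10.0.0.1", 1883, "tcp", 90),
    Packet("10.0.0.1", 1883, "10.0.0.5", 50000, "tcp", 60),
    Packet("10.0.0.5", 50000, "10.0.0.1", 1883, "tcp", 120),
]

uni_flows = aggregate(packets, uni_key)  # two flows, one per direction
bi_flows = aggregate(packets, bi_key)    # one flow covering both directions
```

In this sketch, the three packets yield three packet-level rows, two unidirectional flows, and one bidirectional flow, which shows how each level summarises the same traffic at a coarser granularity.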