Unveiling the IoT's dark corners: anomaly detection enhanced by ensemble modelling

The growing Internet of Things (IoT) landscape requires robust security; traditional rule-based systems are insufficient, driving the integration of machine learning (ML) for effective intrusion detection. This paper provides an inclusive overview of research efforts focused on harnessing ML methodologies to fortify intrusion detection within IoT. Tailored feature extraction techniques are pivotal for achieving high detection accuracy while minimizing false positives. The study employs the IoT23 dataset from Kaggle and incorporates four optimization algorithms – Particle Swarm Optimizer, Whale-Pearson optimization algorithm, Harris-Hawks Optimizer, and Support Vector Machine with Particle Swarm optimization algorithm (SVM-PSO) – for feature extraction and selection. A comparison with ML algorithms such as logistic regression, decision tree and naïve Bayes classifier highlights Harris-Hawks Optimizer as the most effective. Furthermore, ensemble methods, particularly the fusion of random forest with HHO optimization, yield an impressive accuracy of 99.97%, surpassing AdaBoost and XGBoost approaches. This paper underscores the application of diverse ensemble learning techniques to enhance intrusion detection precision and efficiency within the intricate IoT landscape, effectively tackling the challenges posed by its complex and ever-changing nature.


Introduction
The IoT revolutionizes technology interaction by interconnecting physical objects through sensors and internet connectivity. This enables data exchange and automation across sectors like healthcare [1], transportation, agriculture and smart cities, as shown in Figure 1. Wearables and connected devices enhance healthcare monitoring, while smart transportation systems improve traffic management. Agriculture benefits from IoT-driven precision, and smart cities optimize services. Despite these benefits, IoT faces security [2], privacy and interoperability challenges. Robust cybersecurity, privacy regulations and standardized approaches are vital. The future holds 5G-enabled IoT, edge computing and AI-driven analytics, reshaping industries and daily tech interactions.

Security issues in IoT
Security issues in IoT stem from the vast interconnectedness of devices, raising concerns about data privacy, unauthorized access and potential breaches [3]. Vulnerabilities arise due to diverse device types and communication protocols, often lacking robust security measures. Without proper safeguards, IoT devices can be exploited, leading to compromised personal data, disruption of services and even broader cyber threats.

Authentication, data privacy and encryption
Security issues in the IoT ecosystem stem from its vast network of interconnected devices. One major concern is weak authentication and authorization mechanisms [4]. Many devices come with default credentials or lack proper authentication, making them susceptible to unauthorized access. Additionally, data privacy is a significant worry. IoT devices collect a wealth of personal and sensitive data, raising concerns about how this data is stored, transmitted and utilized. Encryption [5] is essential to protect data from interception during transmission and storage, but its implementation varies widely across IoT devices. Ensuring strong encryption protocols and secure key management is crucial to maintain data confidentiality.

Software vulnerabilities, lack of updates and device management
Software security is a significant challenge in IoT. Many devices run on outdated or unpatched software, leaving them vulnerable to known exploits [6]. Regular software updates are essential to address vulnerabilities and improve overall security, but IoT devices often lack robust mechanisms for applying updates. Additionally, the diversity of devices and manufacturers makes ensuring consistent and timely updates challenging. Device management is another issue; managing a vast number of devices and ensuring they are all properly configured and updated is complex and resource-intensive [7]. Inadequate device management can lead to security gaps that attackers can exploit.

CONTACT Jisha Jose jishajose.cse@gmail.com; jisha.jose@mbcet.ac.in, Department of Computer Science and Engineering, Noorul Islam Centre for Higher Education, Kumarakovil, Thuckalay, Kanyakumari District, Tamil Nadu 629180, India

Interoperability, supply chain risks and regulatory gaps
Interoperability between different IoT devices and protocols is a significant concern. Incompatibilities between devices and protocols can introduce vulnerabilities that attackers might exploit to gain unauthorized access or disrupt communication. Moreover, the global supply chain for IoT components introduces risks. Compromised or counterfeit components can find their way into devices, potentially enabling backdoors or other security vulnerabilities. Regulatory gaps are also apparent, with different regions having varying levels of legislation and standards for IoT security. A lack of consistent regulations can result in varying security practices across devices, leaving some more vulnerable than others. Table 1 shows the merits and demerits of IoT.
Addressing these security issues requires a collaborative effort involving manufacturers, service providers, policymakers and end-users. Industry-wide standards for security practices, strong authentication mechanisms, robust encryption protocols, regular software updates and comprehensive device management are essential steps to build a more secure IoT landscape. As IoT continues to evolve, proactive security measures must evolve with it to ensure the potential benefits of this interconnected ecosystem are not overshadowed by security risks.

Table 1 (recovered entries). Merit: reduced use of many electronic devices, as one device does the job of a lot of other devices. Demerits: lack of security; absence of international standards for better communication.

Intrusion detection system
In the expansive landscape of the IoT, cybersecurity is of utmost importance due to the seamless communication between devices [8]. Intrusion detection, a vital component of cybersecurity, plays a pivotal role in safeguarding IoT networks against unauthorized access and malicious activities. Intrusion detection systems (IDS) are integral in monitoring network traffic and system behaviour, quickly identifying suspicious actions that could compromise the security and integrity of IoT devices and data [9]. The dynamic and diverse nature of IoT devices presents challenges for traditional security measures, making tailored intrusion detection mechanisms essential. ML and artificial intelligence techniques are being leveraged to create adaptive, context-aware IDS that can learn normal device behaviour and swiftly detect deviations, providing effective defense against a wide range of threats [10].
As the cybersecurity landscape intensifies, the importance of IDS grows significantly. Organizations face various attacks like malware [11], ransomware and data breaches, highlighting the need for vigilant defense mechanisms. IDS continually monitor network traffic [12], system behaviour and data access patterns, alerting security personnel to any anomalies. They can also uncover unusual patterns indicative of zero-day vulnerabilities, offering early warnings and reducing risks. By supporting compliance with regulations and industry standards, IDS play a pivotal role in meeting security requirements and upholding data integrity. Ultimately, IDS offer proactive protection by strengthening defenses, identifying breaches and preserving sensitive information in a rapidly evolving cyber environment.

Deep learning approaches
Roopak et al. [13] introduced deep learning models aimed at enhancing the cybersecurity of IoT networks. Despite the rapid expansion of IoT technology, its vulnerability to cyber threats remains a significant concern. The paper addressed this issue by presenting deep learning solutions for IoT network security. Notably, the growing frequency of DDoS attacks on IoT networks is highlighted as a major threat. The proposed models were evaluated using the CICIDS2017 dataset, demonstrating an accuracy of 97.16% in detecting DDoS attacks. A comparative analysis was conducted against conventional ML algorithms.
Verma et al. [14] examined the feasibility of employing ML classification algorithms to bolster the security of IoT networks against DoS attacks. The study focuses on the development of anomaly-based IDS. Key metrics and validation approaches are used to evaluate the effectiveness of these models. The CIDDS-001, UNSW-NB15 and NSL-KDD datasets serve as benchmarks for classifier assessment. Statistical tests such as Friedman and Nemenyi are used to check for significant differences between classifiers. The study uses a Raspberry Pi to gauge classifier response times on IoT hardware. To support the development of IoT security measures, the article also provides a method for choosing the best classifier depending on specific needs.
Otoum et al. [15] introduced a Deep Learning-based IDS (DL-ID) tailored for IoT environments. The system combines the Spider Monkey Optimization algorithm (SMO) and the Stacked-Deep Polynomial Network (SDPN) to achieve heightened accuracy in detecting security threats. SMO is employed for optimal feature selection within the datasets, while SDPN is responsible for classifying data into normal and anomaly categories. The DL-ID system is capable of identifying a range of anomalies, including Denial of Service (DoS), User to Root (U2R) attacks, probe attacks and Remote to Local (R2L) attacks. By combining these techniques, the proposed DL-ID system shows potential for enhanced intrusion detection accuracy in IoT environments.

Machine learning approaches
Mahmood et al. [16] addressed challenges arising from the implementation of IoT systems and proposed solutions through ML techniques. The work focuses on an RFID system, crucial for IoT, comparing various technologies to select optimal ones based on functionality and security. Using a prototype IoT system exemplified by baggage tracking at an airport, the research highlights five main differences between IoT and traditional systems: technical limitations of IoT devices, the significant influence of the physical environment, inadequate security focus during design, susceptibility of IoT devices to attacks like DDoS, and heightened privacy sensitivity of IoT use cases. The study uses the KDD Cup 1999 dataset, a widely used cybersecurity benchmark, for training, testing and validation, with MATLAB R2019a as the software environment. By identifying challenges and implementing solutions, the paper contributes to the understanding and effective implementation of IoT systems while addressing critical security and privacy concerns.
Saheed et al. [17] proposed an ML-based IDS (ML-IDS) for IoT network attacks. The work applies supervised ML algorithms to detect attacks, employing feature scaling and dimensionality reduction techniques. Six ML models were tested on the UNSW-NB15 dataset, which contains various attack types and normal activities. Experimental results, including accuracy (99.9%), MCC (99.97%) and other metrics, demonstrated the effectiveness of the ML-IDS. The paper contributes to IoT security and privacy by using ML approaches for robust intrusion detection, thereby mitigating the challenges posed by IoT device limitations and network scalability.

Hybrid approaches
Sahu et al. [18] introduced a security framework and attack detection mechanism centred around a Deep Learning model to identify malicious devices. This approach uses a Convolutional Neural Network (CNN) to extract feature representations from data, followed by classification through a Long Short-Term Memory (LSTM) model. The experimental evaluation employs a dataset collected from twenty compromised IoT devices using Raspberry Pi. The study reports a 96% accuracy rate for detecting attacks. By combining CNN and LSTM, the proposed mechanism offers a promising solution for detecting malicious activities in IoT environments.
Kumar et al. [19] presented an intelligent cyber-attack detection system customized for IoT networks using a novel hybrid feature reduction technique. This method involves three key steps: an initial feature ranking, random forest mean decrease accuracy and gain ratio, resulting in distinct sets of features. These sets are combined using an AND operation to create a single optimized feature set. This condensed feature set is then fed into three well-established ML algorithms (random forest, K-nearest neighbour and XGBoost) to identify cyber-attacks. The efficacy of the proposed framework is assessed using established datasets like NSL-KDD, as well as contemporary IoT-centric datasets like BoT-IoT and DS2OS. By adopting this strategy, the paper advances intelligent cyber-attack detection in IoT networks through refined feature selection and robust ML algorithms.

Materials and method
This section provides an overview of the foundational components driving the study. The IoT 23 dataset, formatted as a CSV file, forms the basis for the investigation. The study uses four distinct optimization algorithms: PSO [20], WOA [21], Harris-Hawks Optimizer (HHO) [22] and SVM-PSO. Using these approaches, different feature subsets are extracted from the dataset, and the most effective feature set is assessed with three ML techniques: logistic regression [23], decision tree classifier [24] and naïve Bayes classifier [25]. This assessment determines that the HHO algorithm yields the optimal feature selection. To further bolster intrusion detection, the study integrates three ensemble models: AdaBoost classifier [26], XGBoost classifier [27] and random forest classifier [28]. The proposed system's schematic is presented in Figure 2, illustrating the sequence of operations encompassing the dataset, optimization techniques, ML methods and ensemble models.

Dataset description
The IoT 23 dataset is a substantial compilation structured to support research and innovation in IoT security. Comprising diverse and realistic IoT network traffic, this dataset plays a pivotal role in enabling the development and evaluation of IDS and cybersecurity solutions. It encompasses various attack scenarios and normal activities, enhancing its utility for training and testing ML models. The IoT 23 dataset is crucial for building a deeper understanding of the evolving threat landscape within IoT environments and for fostering innovation in cybersecurity measures that ensure the integrity and privacy of interconnected devices and systems.

Data preprocessing
Data preprocessing involves preparing the CSV data for analysis. Initially, the dataset is loaded using pandas, and exploratory data analysis is conducted to understand its structure and identify issues. Missing values are addressed by either removing or imputing them, categorical variables are transformed using techniques like one-hot encoding, and feature scaling is applied. The data is split into training and test sets, and the preprocessed data is saved for future use. This process ensures the dataset is cleansed, transformed and structured in a way that optimizes its usability. The dataset consists of 21 features, which are described in Table 2.
Figure 3 illustrates the visual representation of the dataset, where 0 corresponds to the count of instances classified as non-malicious cases, and 1 signifies the count of instances categorized as malicious cases.
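The preprocessing steps described above can be sketched in a few lines. The example below is only an illustration: it uses the Python standard library instead of pandas to stay dependency-free, and the toy records, the helper names (`one_hot`, `min_max_scale`, `train_test_split`) and the 25% test fraction are assumptions, not details taken from the study.

```python
import random

def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector."""
    categories = sorted(set(values))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return vectors, categories

def min_max_scale(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows reproducibly and split them into train and test."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy records: (protocol, duration in seconds, label).
records = [("tcp", 2.0, 1), ("udp", 0.5, 0), ("tcp", 4.0, 1), ("icmp", 1.0, 0)]

proto_vectors, proto_names = one_hot([r[0] for r in records])
durations = min_max_scale([r[1] for r in records])

# Each row: one-hot protocol columns, scaled duration, then the label.
rows = [pv + [d, r[2]] for pv, d, r in zip(proto_vectors, durations, records)]
train_rows, test_rows = train_test_split(rows)
```

Scaling is shown on the full column for brevity; in practice the scaling bounds are usually computed on the training split only, so that no information from the test set leaks into training.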

Particle swarm optimization
PSO is a computational optimization technique inspired by the social behaviour of birds or fish. In PSO, a population of potential solutions (particles) navigates through a search space to find the optimal solution for a given problem. Researchers commonly simplify the PSO algorithm as a random search problem in a space with $D$ dimensions. As particle $k$ explores the $D$-dimensional space, it commences from a set of randomly positioned particles, gradually converging towards an optimal solution through iterative processes. During the ongoing particle search, the self-found optimal position $p_k = (p_{k1}, p_{k2}, \ldots, p_{kD})^T$ serves as the local optimal solution, characterized by its associated velocity vector $v_k = (v_{k1}, v_{k2}, \ldots, v_{kD})^T$. In contrast, the global optimal solution is represented by the optimal position $P_g = (P_{g1}, P_{g2}, \ldots, P_{gD})^T$, which is established by the entire particle swarm's collective search. Throughout each iteration, a particle adjusts both its position and velocity by tracking these two optimal solutions, namely $(p_k, P_g)$. The update mechanism follows the formulas given in Equations (2) and (3).
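For reference, the standard PSO update can be written in the notation of this section as follows. This is a sketch of the commonly used formulation rather than a reproduction of the paper's own equations: the current position $x_{kd}$, the acceleration coefficients $c_1, c_2$ and the uniform random numbers $r_1, r_2 \in [0, 1]$ are assumptions that do not appear in the text.

```latex
\begin{align}
v_{kd}^{t+1} &= \omega\, v_{kd}^{t}
  + c_1 r_1 \left(p_{kd} - x_{kd}^{t}\right)
  + c_2 r_2 \left(P_{gd} - x_{kd}^{t}\right), \\
x_{kd}^{t+1} &= x_{kd}^{t} + v_{kd}^{t+1},
\end{align}
```

for particles $k = 1, \ldots, N$ and dimensions $d = 1, \ldots, D$, where $\omega$ is the inertia factor.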
Here, $N$ denotes the total number of particles in the population, and $d$ signifies the $d$-th dimension of particle $k$. The current iteration number is denoted by $t$, while $\omega$ stands for a non-negative inertia factor that governs the balance between global and local optimization capacities. Higher values of $\omega$ amplify global optimization while diminishing local optimization strength, and the reverse holds true. The PSO algorithm is structured as outlined below. The PSO algorithm identifies and selects five features from the given dataset. Through iterative optimization, PSO evaluates numerous combinations of features to determine the optimal subset. By leveraging its swarm intelligence-inspired mechanism, PSO homes in on the most relevant attributes that contribute significantly to the study's objectives. Table 3 lists the features selected by the PSO algorithm.
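To make the role of the inertia factor and the two tracked optima concrete, the following minimal sketch runs a continuous PSO on a simple test function. It is not the feature-selection procedure used in the study: the function name `pso_minimize`, the sphere objective, and the parameter values (`omega=0.7`, `c1=c2=1.5`, 20 particles, 100 iterations) are illustrative assumptions.

```python
import random

def pso_minimize(objective, dim, n_particles=20, iterations=100,
                 omega=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimal continuous PSO; returns (best position, best value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # per-particle best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_val[k])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # swarm-wide best

    for _ in range(iterations):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + pull towards own best + pull towards swarm best.
                vel[k][d] = (omega * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                # Move the particle and keep it inside the search bounds.
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(xi * xi for xi in x)

best_position, best_value = pso_minimize(sphere, dim=3)
```

For feature selection, one common adaptation is to threshold each position component to decide whether the corresponding feature is kept, and to use a classifier's validation error as the objective.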

Whale-Pearson optimization algorithm
The Whale-Pearson optimization algorithm is an enhanced version of the Binary Whale swarm algorithm [29]. This new version incorporates the concept of simulated annealing for updating the positions. The Whale optimization algorithm is based on imitating the foraging movements of whales. The revised algorithm retains the fundamental stages of its predecessor for the exploration procedure. However, the original position update mechanism has been replaced with a correlation-based selection algorithm. This new method integrates both correlation and classifier-guided fitness evaluations, which makes it an embedded selection approach. The operational mechanics of the correlation design are described below.
Assume $X_o$ represents the local optimal solution achieved from the Binary Whale wrapper at an iteration's conclusion. The process of updating positions relies on $X_o$ and a predetermined maximum iteration count. This function produces $I_p$ random solutions, each of which undergoes correlation evaluation using the Pearson correlation method. In this context, $f_a$ represents the attribute class, and $x_j$ signifies the feature attribute, where $j$ spans from 0 to $T$. The mean correlation between the features and the class attribute is computed using Equations (4) and (5).
Equation (5) involves $x$, the input, and $y$, the attribute that holds the output classification. By applying the Mutation function to the prevailing Gbest solution, a set of $I_p$ random position vectors is generated. Each vector is assessed using the unique correlation-based objective function outlined in 3. The most optimal solution among the obtained set is adopted as the present position, and the quest for finding food persists until the predefined maximum iteration count is reached. The Whale Swarm Wrapper technique yields a smaller set of selected features in comparison to the PSO approach. This suggests that the Whale Swarm Wrapper prioritizes a more focused subset of attributes from the dataset. The contrast in the number of selected features underscores the distinct feature evaluation strategies employed by the two algorithms, potentially highlighting the different ways they assess feature importance and relevance. Table 4 shows the features selected by the WOA algorithm.
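The correlation evaluation can be illustrated with a short sketch that computes the Pearson coefficient between each candidate feature and the class attribute and then averages the result over a feature subset. The helper names and the choice to average absolute values (so that strongly negative correlations also count as relevant) are assumptions made for illustration; the text does not state how the paper combines the individual coefficients.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (dev_x * dev_y)

def mean_abs_correlation(feature_columns, labels):
    """Average |r| between each selected feature column and the class labels."""
    scores = [abs(pearson(col, labels)) for col in feature_columns]
    return sum(scores) / len(scores)

# Hypothetical toy data: three candidate features and a binary class attribute.
labels = [0, 0, 1, 1]
feature_a = [1, 2, 3, 4]   # rises with the label
feature_b = [5, 5, 5, 5]   # constant, uninformative
feature_c = [1, 1, 0, 0]   # perfectly anti-correlated with the label

subset_score = mean_abs_correlation([feature_a, feature_c], labels)
```

A candidate position vector that selects `feature_a` and `feature_c` would score higher under this objective than one that includes the constant `feature_b`.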

Harris-Hawks optimizer (HHO)
The population-based Harris' Hawks Optimization (HHO) method [30] harnesses the cooperative behaviour exhibited by groups of Harris' hawks, along with their distinct hunting tactics such as pursuing prey, establishing blockades and executing surprise dives. The algorithm operates within two primary phases: exploration, where potential prey is identified, and exploitation, which involves strategizing attacks, including blockades and surprise dives. The algorithm involves several steps. First, it estimates the population vector of hawks and calculates their fitness values, along with identifying the best position vector for the prey. Following this, it proceeds to modify the initial energy ($E_0$) and the resistance strength ($J$) of the prey, along with adjusting its escaping energy, during every iteration. These updates are performed using Equations (7)-(9). This approach allows the algorithm to dynamically adapt and refine its tactics to optimize the search process for improved performance.
The exploration phase is characterized by a prey escaping energy value greater than 1. During this phase, the hawk position vector is iteratively updated using Equation (9) to determine its blockade position. $X_m(t)$ signifies the average position of the hawk population, and UB and LB denote the upper and lower bounds, representing the best-positioned and least-fit hawk in iteration $t$. In the exploitation phase, four modes are distinguished:
Soft blockade: The escaping energy and the unsuccessful escape chance both exceed 0.5. The victim tires out due to successive hawk sieges, eventually falling prey to a surprise dive.
Hard blockade: The prey's escaping energy is less than 0.5, but its unsuccessful escape chance is at least 0.5. The prey's energy diminishes, and the hawk hunts it unimpeded, incorporating a surprise dive.
Soft blockade (different scenario): The prey's escaping energy surpasses 0.5, yet the escape chance parameter falls below 0.5. The prey attempts a deceptive escape, but the hawks tire of the ruse and ultimately hunt it down through various blockades and movements.
Hard blockade (limited energy): Both parameters fall below 0.5, indicating the prey's lack of energy.
The algorithm further updates the hawk's position vector using Equations (10) to (12).The algorithm concludes after multiple iterations, with the fittest hawk successfully capturing the prey, signifying the termination.
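The phase logic described above can be sketched as a small decision function. The sketch assumes the commonly published HHO formulation, in which the escaping energy decays as $E = 2E_0(1 - t/T)$ with $E_0$ drawn from $(-1, 1)$ and $T$ the maximum number of iterations, and in which a random number $r$ is compared against 0.5. The function names and phase labels are illustrative, and the position update rules themselves are omitted.

```python
def escaping_energy(e0, t, max_iter):
    """Escaping energy E = 2 * E0 * (1 - t / T), shrinking as iterations pass."""
    return 2.0 * e0 * (1.0 - t / max_iter)

def hho_phase(energy, r):
    """Pick the HHO phase from the energy magnitude |E| and the random number r.

    Assumed convention: |E| >= 1 triggers exploration; otherwise the pair
    (|E| vs 0.5, r vs 0.5) selects one of the four exploitation modes.
    """
    magnitude = abs(energy)
    if magnitude >= 1.0:
        return "exploration"
    if r >= 0.5:
        return "soft blockade" if magnitude >= 0.5 else "hard blockade"
    if magnitude >= 0.5:
        return "soft blockade with rapid dives"
    return "hard blockade with rapid dives"

# Hypothetical run: E0 = 0.9 over 100 iterations.
early = hho_phase(escaping_energy(0.9, 0, 100), r=0.3)
middle = hho_phase(escaping_energy(0.9, 50, 100), r=0.7)
late = hho_phase(escaping_energy(0.9, 90, 100), r=0.2)
```

Because the energy magnitude shrinks linearly with the iteration count, the search naturally shifts from exploration early on to the hard blockade modes near the end.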
Figure 4 shows a flowchart illustrating the HHO process. The HHO algorithm excels in feature selection compared to the other algorithms. This indicates that HHO adeptly identifies and ranks the most pertinent features from the dataset. Its capability to yield the optimal feature subset underscores its effectiveness in recognizing attributes that significantly contribute to the study's objectives, potentially leading to enhanced model performance. The features selected by the HHO algorithm are shown in Table 5.

Support vector machine with particle swarm optimization algorithm (SVM-PSO)
The SVM kernel employs a technique called the "kernel trick" to address non-linear problems using a linear classifier. This approach transforms data from being linearly inseparable to becoming separable. The kernel function is applied to each data instance, converting the initial non-linear observations into a higher-dimensional space where they become separable. This process enhances the SVM's ability to effectively classify complex data [14].
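A small numerical check illustrates why the kernel trick works: a kernel evaluated in the input space gives the same value as an inner product in a higher-dimensional feature space, without that space ever being built explicitly. The degree-2 polynomial kernel and the function names below are chosen for illustration only; the text does not state which kernel the study uses.

```python
import math

def poly_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel k(x, z) = (x . z)^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2

def explicit_map(x):
    """Feature map for 2-D inputs whose inner product equals poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

x, z = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, z)                       # computed in 2-D
mapped_value = dot(explicit_map(x), explicit_map(z))   # computed in 3-D
```

Both routes give the same number (up to floating-point rounding), but the kernel route never leaves the original two dimensions, which is what keeps SVM training tractable for richer feature spaces.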
SVM-PSO is a hybrid approach that combines the power of SVM for classification tasks with the optimization capabilities of PSO. Figure 5 shows the input space to feature space conversion in SVM-PSO using kernel functions. In SVM-PSO, PSO is used to automatically search for the optimal parameters of the SVM algorithm, such as the kernel parameters and the regularization parameter. By leveraging PSO's ability to explore and exploit the parameter space, SVM-PSO aims to enhance the accuracy and generalization of SVM models by fine-tuning these parameters for improved performance on classification problems.

Machine learning methods
An ML model is a mathematical representation of a problem that learns patterns and relationships from data to make predictions or decisions. It involves selecting an appropriate algorithm, training the model on a labelled dataset, and fine-tuning its parameters to achieve optimal performance. The model then undergoes validation and testing on new, unseen data to ensure its generalization ability. ML models can range from simple linear regression to complex neural networks, and they are widely used across various domains to automate tasks, gain insights from data, and improve decision-making processes. Customized feature extraction methods play a crucial role in achieving accurate intrusion detection while mitigating false alarms. We employed three ML algorithms to determine the most effective optimization approach among the mentioned options.

Logistic regression
Logistic Regression is a binary classification algorithm used to predict the probability of an instance belonging to a certain class. Figure 6 shows the schematic diagram of logistic regression. It models this probability using the logistic function, which transforms a weighted sum of the input features. The model's parameters are learned from training data by minimizing the log loss (cross-entropy) between predicted probabilities and actual class labels. The resulting model can then make predictions by comparing predicted probabilities to a threshold, typically 0.5. Logistic Regression is widely used for its simplicity, interpretability and effectiveness in various fields where binary classification is required.
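The following sketch shows these pieces together: the logistic function, the log loss, a plain batch gradient descent loop, and thresholding at 0.5. The one-feature toy data, the learning rate and the number of epochs are assumptions for illustration and are unrelated to the study's experimental settings.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability of class 1 from the weighted sum of the features."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def log_loss(y_true, probs, eps=1e-12):
    """Mean cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def train(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log loss."""
    weights, bias, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical toy data: one feature, label 1 for the larger values.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

weights, bias = train(X, y)
probs = [predict_proba(weights, bias, xi) for xi in X]
predictions = [1 if p >= 0.5 else 0 for p in probs]
final_loss = log_loss(y, probs)
```

The 0.5 threshold can be moved to trade false positives against false negatives, which matters in intrusion detection where the two error types have different costs.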

Naive Bayes classifier
The Naive Bayes classifier is a probabilistic algorithm used for classification tasks. It is based on Bayes' theorem and the assumption of feature independence, which is often considered naive but simplifies the computation. Figure 7 shows a schematic diagram of the classifier. It calculates the probability of an instance belonging to a particular class given its features. The classifier estimates class probabilities by multiplying the class prior by the conditional probabilities of the individual features given the class. Naive Bayes is especially useful for text classification and spam filtering, where it models word frequencies. While the independence assumption might not hold in all cases, Naive Bayes is computationally efficient, interpretable and performs well on certain types of data.
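A minimal categorical Naive Bayes sketch makes the multiplication of conditional probabilities explicit. It works in log space to avoid underflow and uses Laplace smoothing so that unseen feature values do not zero out a class; the toy protocol and connection-state values, the labels and the function names are hypothetical and not taken from the dataset.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels, alpha=1.0):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)   # key: (feature index, class)
    feature_values = defaultdict(set)       # distinct values seen per feature
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[(j, label)][value] += 1
            feature_values[j].add(value)
    return class_counts, feature_counts, feature_values, alpha

def predict(model, row):
    """Return the class with the highest log prior plus log likelihoods."""
    class_counts, feature_counts, feature_values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)                      # log prior
        for j, value in enumerate(row):
            numerator = feature_counts[(j, label)][value] + alpha
            denominator = count + alpha * len(feature_values[j])
            score += math.log(numerator / denominator)       # smoothed likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (protocol, connection state) -> label.
rows = [("tcp", "S0"), ("tcp", "S0"), ("udp", "SF"), ("tcp", "SF"), ("udp", "SF")]
labels = ["malicious", "malicious", "benign", "benign", "benign"]

model = train_naive_bayes(rows, labels)
first = predict(model, ("tcp", "S0"))
second = predict(model, ("udp", "SF"))
```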

Decision tree classifier
A Decision Tree Classifier is an ML algorithm used for classification tasks. It operates by recursively partitioning the dataset into subsets based on the values of input features, leading to a tree-like structure of decisions and outcomes. Figure 8 shows the process to implement a decision tree for intrusion detection. At each internal node of the tree, a feature is chosen as a split criterion, and the data is divided into branches based on its possible values. This process continues until a stopping condition is met, such as a maximum tree depth or a minimum number of instances per leaf. The leaves of the tree represent the predicted class labels. Decision trees are intuitive, easy to visualize, and can handle both categorical and numerical features.
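The choice of a split at an internal node can be sketched as a search over candidate thresholds that minimizes the weighted impurity of the two resulting branches. The Gini impurity used below is one common criterion and is an assumption here, since the text does not name the criterion used in the study; the toy values and function names are likewise illustrative.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best split on one feature.

    Candidate thresholds are the midpoints between consecutive distinct values.
    """
    best_threshold, best_impurity = None, float("inf")
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity

# Hypothetical toy feature: small values are benign (0), large are malicious (1).
values = [1, 2, 3, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(values, labels)
```

A full tree repeats this search over every feature at every node and recurses into the two branches until the stopping condition is met.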

Ensemble model
An ensemble model is an ML approach that combines the predictions of multiple individual models to improve overall performance and accuracy. Ensemble methods often involve training multiple models with different initializations, subsets of data, or algorithm variations, and then combining their predictions through techniques like averaging, voting, or weighted averaging. Examples of ensemble methods include Random Forests (combining decision trees), Gradient Boosting (iteratively improving weak learners) and AdaBoost (boosting weak learners). Ensemble models are known for their ability to reduce overfitting, enhance generalization and produce more reliable results, making them popular in various ML tasks.
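Voting, the simplest of the combination techniques mentioned above, can be sketched as follows. The three prediction lists stand in for the outputs of three hypothetical base classifiers; the function name is illustrative.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote.

    Each inner list holds one model's predictions for the same samples.
    Using an odd number of models avoids ties for binary labels.
    """
    n_samples = len(predictions_per_model[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        voted.append(votes.most_common(1)[0][0])
    return voted

# Hypothetical outputs of three base classifiers on four samples.
model_a = [1, 0, 1, 1]
model_b = [1, 1, 0, 1]
model_c = [0, 0, 0, 1]

ensemble_prediction = majority_vote([model_a, model_b, model_c])
```

Each base model makes a different mistake here, yet the vote can still recover the majority label per sample, which is the intuition behind the robustness of ensembles.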

XGBoost
Extreme Gradient Boosting (XGBoost) is a member of the boosting algorithm family and is a practical implementation of the gradient boosting approach. In the case of classification tasks, XGBoost constructs numerous trees in an iterative manner, with each new tree fitted to the errors of the trees built before it. This use of earlier errors improves accuracy in subsequent iterations. To limit model complexity and the risk of overfitting, XGBoost incorporates L1 (as in the Least Absolute Shrinkage and Selection Operator) and L2 (as in Ridge Regression) regularization.
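The role of the two penalties can be seen in the regularized objective that XGBoost minimizes. The form below follows the library's documented objective; the symbols are introduced here for illustration and are not used elsewhere in this paper: $l$ is the training loss, $f_m$ the $m$-th tree, $T_m$ its number of leaves, $w_{mj}$ its leaf weights, and $\gamma$, $\lambda$, $\alpha$ the complexity, L2 and L1 coefficients.

```latex
\begin{align}
\mathcal{L} &= \sum_{i} l\left(y_i, \hat{y}_i\right) + \sum_{m} \Omega\left(f_m\right), \\
\Omega\left(f_m\right) &= \gamma T_m
  + \frac{1}{2}\lambda \sum_{j=1}^{T_m} w_{mj}^{2}
  + \alpha \sum_{j=1}^{T_m} \left|w_{mj}\right| .
\end{align}
```

Larger values of $\lambda$ and $\alpha$ shrink the leaf weights and therefore make each added tree more conservative.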

Random forest
Random Forest is an ensemble classification method comprising a multitude of Decision Tree classifiers.Through the construction of numerous decision trees on the training dataset and employing majority voting, the ultimate class prediction is determined, as depicted in Figure 9. Consequently, it yields enhanced and reliable predictions, leading to improved system performance in accuracy, recall, precision and false alarm rate.

Adaboost
The

Performance parameters
Performance parameters in ML approaches encompass various metrics to assess model effectiveness. These include accuracy, measuring overall correct predictions; precision and recall, evaluating false positives and false negatives; and the F1 score, balancing precision and recall. The performance parameters used by this work are tabulated in Table 6.
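These metrics all derive from the four confusion-matrix counts, as the sketch below shows using their standard definitions. The false alarm rate (false positive rate) is included because it is mentioned in the random forest discussion; whether Table 6 lists exactly these quantities is an assumption, and the toy labels and function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels where 1 means malicious."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def safe_div(numerator, denominator):
    """Division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def classification_metrics(y_true, y_pred):
    """Standard metrics computed from the confusion-matrix counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "false_alarm_rate": safe_div(fp, fp + tn),
    }

# Hypothetical labels and predictions for eight flows.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = classification_metrics(y_true, y_pred)
```

On heavily imbalanced traffic, accuracy alone can look high even when most attacks are missed, which is why precision, recall and the F1 score are reported alongside it.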

Hardware and software setup
The system makes use of an IoT dataset comprising 21 attributes.To ensure consistent computational

Experimental results
Among the evaluated feature selection methods in the table, the HHO algorithm stands out as the most effective for enhancing ML classification models. In direct comparison with alternative approaches, HHO consistently yields superior results. These findings underscore HHO's proficiency in selecting pertinent features that significantly contribute to the model's accuracy and predictive capabilities. This outcome highlights the algorithm's potential for optimizing feature subsets, thereby elevating the overall performance of the classification models. Table 7 shows the results attained by different machine learning models.
The outcomes of the predictions indicate that our proposed feature selection method, utilizing the Harris Hawks Optimization algorithm in combination with the random forest classifier, yields the most favourable results when contrasted with alternative approaches. This combination of techniques consistently demonstrates superior performance across various evaluation metrics. The synergy between the Harris Hawks Optimization algorithm and the random forest classifier showcases their collective potential in enhancing predictive accuracy and classification capabilities. These results accentuate the effectiveness of this combined approach in selecting salient features that substantially contribute to the model's robustness and precision. In essence, the study underscores the notable advantages of leveraging the Harris Hawks Optimization algorithm alongside the random forest classifier for optimizing feature selection, ultimately leading to enhanced outcomes in predictive modelling tasks. Table 8 compares the result of random forest with another ensemble model.

Conclusion
The surge in IoT devices underscores the urgency of fortifying the security and integrity of interconnected systems. The exploration of intrusion detection within the IoT landscape reveals the limitations of traditional rule-based systems in tackling the dynamic and diverse nature of threats. This has propelled the integration of ML techniques to bolster detection capabilities. The paper's emphasis on tailored feature extraction techniques and the utilization of diverse ML algorithms highlights the potential for accurate and efficient intrusion detection in IoT environments. The demonstrated success of ensemble methods further accentuates the viability of combining algorithmic strengths for enhanced robustness. The attainment of a remarkable 99.97% accuracy through the fusion of random forest and the Harris-Hawks Optimizer underscores the promising advancements in this domain. This paper underscores the crucial role of ML in countering the evolving challenges of intrusion detection in the intricate and interconnected world of IoT.

Figure 1. Daily life application of IoT.

Figure 5. Input space to feature space conversion in SVM-PSO using kernel functions.

Figure 8. Process to implement decision tree for intrusion detection.

Table 1. Merits and demerits of IoT.

Table 2. Features in the dataset.

Table 3. Features selected by the PSO algorithm.

Table 4. Features selected by the WOA algorithm.

Table 5. Features selected by the HHO algorithm.

Table 7. Comparing the result of optimization algorithm.

Table 8. Comparing the result of random forest with another ensemble model.
Note: Bold values indicate proposed value results.