Develop a Hybrid Intelligent Fuzzy Self-Adaptive Mutated Genetic Algorithm and Deep Reinforcement Learning for Efficient Home Energy Management

Balamurugan Vaithiyanathan*, Kavitha Ramaswami Jothi, Suresh Kumar Paramasivam

Department of Electronics and Communication Engineering, University College of Engineering, Panruti 607106, India

Department of Civil Engineering, University College of Engineering, Panruti 607106, India

Corresponding Author Email: balamuruganavb@gmail.com

Page: 1131-1148 | DOI: https://doi.org/10.18280/ts.420244

Received: 18 January 2025 | Revised: 13 April 2025 | Accepted: 19 April 2025 | Available online: 30 April 2025

© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

This paper proposes a hybrid strategy that combines Deep Reinforcement Learning (DRL), Self-Adaptive Mutated Genetic Algorithm (SAM-GA) and fuzzy logic to create an effective home energy conservation system. Over the past decade, the demand for energy-efficient systems has highlighted the need for solutions that maximize energy use without compromising quality of life, particularly in residential environments. Traditional energy management techniques often struggle to handle dynamic and unpredictable electricity consumption patterns, leading to inefficient resource use and higher energy costs. The proposed model leverages DRL to improve decision-making through continuous learning, SAM-GA to optimize power asset allocation, and fuzzy logic to manage uncertainties related to electricity demand. The primary objective of this hybrid algorithm is to reduce energy costs and consumption in residential areas by dynamically balancing supply and demand. Experimental results indicate that the algorithm effectively adapts to fluctuating energy demands, achieving a 20% reduction in overall electricity consumption while ensuring the smooth operation of household tasks. This innovative approach not only optimizes energy usage but also provides a robust, adaptable foundation for sustainable home energy management.

Keywords: 

home energy management system (HEMS), fuzzy logic, Self-Adaptive Mutated Genetic Algorithm (SAM-GA), Deep Reinforcement Learning (DRL), smart energy optimization, energy efficiency, real-time energy management, adaptive decision-making

1. Introduction

Around the world, electricity serves as an essential foundation for the operation of numerous enterprises across all sectors. Energy generation is one of the primary drivers of a country's economic growth. In today’s world, people rely on electricity to carry out even routine tasks. The demand for power is rising daily. The traditional centralized power grid system is struggling to meet customers' growing power demands [1]. Because the existing energy system primarily depends on fossil fuels, additional electricity-generating units must be installed to increase energy production. Figure 1 illustrates the framework of the Home Energy Management System (HEMS). Through a combination of management systems, advances in communication technology, and optimization techniques, the HEMS can plan and regulate equipment operations to enhance overall energy efficiency. It also enables consumers to work with grid operators to establish demand-response strategies and consumption plans through bidirectional communication [2]. Electricity production, distribution, and consumption can be visualized as a tree structure, where nuclear power plants or other large energy production units act as root nodes, distribution centers and transformers serve as secondary nodes, and clients function as leaf nodes.

In this unidirectional network, energy flows from the source to the users, who act as reception nodes [3]. One of the primary goals of HEMS is to reduce customers' power bills while meeting their needs and preferences. To achieve this, HEMSs perform two main functions: (1) using smart meters to monitor real-time power consumption, and (2) scheduling household appliances for minimal energy use [4].

A fast-distributed HEMS method for multiple homes was developed using a Mixed-Integer Nonlinear Programming (MINLP) approach with nonconvex relaxation. In another study, a residential energy strategy managed by an Energy Storage System (ESS) was introduced, incorporating novel technologies for Electric Vehicles (EVs) and ESSs [5]. An alternative HEMS method based on predictive control was proposed by forecasting the EV state. To maintain consumer satisfaction during HEMS operations, recent research has proposed various methods. A Quality of Experience (QoE)-aware HEMS technique was developed to continually adjust the QoE threshold, factoring in EVs and renewable energy [6].

HEMS research has increasingly incorporated Machine Learning (ML) techniques, especially data-driven methods, for enhanced energy management. Techniques for accurately forecasting photovoltaic system output for efficient energy management in buildings were explored by Chen et al. [7]. Load prediction using Deep Neural Network (DNN) techniques has also been employed to reduce energy consumption in homes and buildings. Reinforcement Learning (RL) has emerged as a promising ML technique for optimizing energy use in buildings. Google DeepMind's RL-based energy management system demonstrated a 40% reduction in energy consumption by optimizing data center cooling [8]. In HEMS applications, RL techniques such as Q-learning combined with Artificial Neural Networks (ANNs) have been used to solve HEMS challenges, maintain appliance efficiency, estimate consumer comfort, and provide real-time pricing forecasts [9]. A novel demand-management approach for HEMS was proposed by combining fuzzy logic for incentive functions with Q-learning to reduce state-action pairings. A comprehensive Deep RL (DRL) approach was also developed to manage energy consumption in commercial buildings while considering occupant comfort for heating, environmental, and lighting conditions [10].

Indoor positioning technologies have progressed significantly, with methods like wearable badges and systems such as UbiSense. Many indoor positioning systems require frequent calibration, are costly, and rely on additional hardware, limiting their practicality. New indoor positioning systems that are cost-effective, nonintrusive, and hardware-independent are needed. Existing systems commonly use Received Signal Strength Indicator (RSSI) methods with Bayesian filtering, hidden Markov models, and Monte Carlo approaches [11]. Genetic algorithms and clustering techniques have also been explored, though they often lack interpretability and struggle with ambiguous data. Several studies have used fuzzy logic to handle uncertainties, producing understandable, effective systems. These systems usually require knowledge of all accessible Access Point (AP) locations, which is impractical in most settings [12]. Conversely, price-based programs offer an indirect way to manage customer loads. These programs provide consumers with time-varying rates based on the cost of power at different times, encouraging reduced electricity usage during peak hours. Time-of-Use (TOU) pricing is simple for consumers to understand, allowing them to shift power use to lower-rate periods and thus spread consumption throughout the day to avoid high costs [13].

1.1 Problem statement

The growing demand for energy in homes, coupled with the need for economical and sustainable usage, presents significant challenges for effective home energy management. Traditional energy management techniques often fail to adapt to the dynamic nature of household energy consumption, leading to inefficiency, increased costs, and suboptimal resource allocation. These methods struggle to handle fluctuating energy demands and uncertainties in customer behavior, resulting in either excessive energy use or inadequate supply during peak hours. To address these challenges, this study proposes a hybrid approach that continuously optimizes energy consumption in real-time by integrating fuzzy logic with DRL and SAM-GA. This approach aims to reduce energy costs, improve efficiency, and promote environmentally friendly energy usage in home settings by intelligently balancing load, demand, and availability while adapting to household schedules.

Figure 1. The structure of HEMS

Figure 2. Structure of energy sharing community

In this energy-sharing community framework, smart homes equipped with ESS can sell excess electricity to a shared energy pool and purchase power based on existing energy prices. If the power pool lacks sufficient energy, non-smart homes can purchase electricity from the wholesale market at lower rates than those in the pool. Non-smart homes can also install Distributed Generators (DGs), like solar panels, to sell surplus energy to the power pool at rates above the Feed-in Tariff (FIT). Each distributed entity can decide whether to buy from or sell to the pool, and Figure 2 illustrates this community energy-sharing framework. This design gives smart homes the flexibility to earn additional benefits by selling power beyond the standard FIT, while less advanced consumers can participate in the Peer-To-Peer (P2P) program and enjoy lower rates compared to retail electricity prices.

1.2 Motivation

The motivation for this research stems from the pressing need to make home energy management smarter, more efficient, and environmentally sustainable. With increasing global energy demands and rising costs, households are under pressure to reduce their energy consumption without compromising the comfort and quality of daily living. Additionally, as more homes incorporate renewable energy sources, such as solar panels and battery storage systems, there is a growing need for systems that can intelligently manage these resources to maximize their benefits. Traditional energy management approaches often lack the adaptability and precision to handle the complexities of modern household energy demands, especially under varying conditions and unpredictable human behaviour. By leveraging advanced optimization techniques like SAM-GA, fuzzy logic, and DRL, this research aims to create a solution that dynamically adjusts to energy usage patterns, minimizes costs, and promotes sustainability. Ultimately, this work aspires to make home energy management a key component of the global shift toward more energy-conscious and sustainable living environments.

1.2.1 Primary contributions

Hybrid Model for Energy Management: This research introduces a novel hybrid framework that combines fuzzy logic, a Self-Adaptive Mutated Genetic Algorithm (SAM-GA), and Deep Reinforcement Learning (DRL) to address the complex and dynamic nature of home energy management. By integrating these three methods, the model adapts to varying energy demands and user behaviours with high precision.

Adaptive Real-Time Optimization: The proposed model leverages SAM-GA to dynamically optimize energy allocation and consumption in real-time. This adaptive optimization allows for more efficient distribution of energy resources, especially in fluctuating household environments, and supports sustainable energy management practices by minimizing waste.

Enhanced Handling of Uncertainties in Energy Demand: By incorporating fuzzy logic, the model addresses the inherent uncertainties in household energy usage, effectively adapting to unpredictable shifts in demand. This flexible, rule-based approach is particularly beneficial in residential settings where usage patterns can vary significantly throughout the day.

Improved Integration of Renewable Energy: The hybrid model is designed to optimize not only traditional energy sources but also renewable energy generation and consumption. This contribution helps households maximize the use of renewable resources, reducing dependency on non-renewable energy and lowering energy costs.

High Accuracy in Energy Forecasting and Cost Reduction: Experimental results demonstrate the model’s effectiveness, achieving high accuracy rates in forecasting energy demand and optimizing energy costs. The DRL component enables continuous learning and adaptation, ensuring sustained performance improvements and cost savings over time.

Potential for Application in Smart Home Systems: This paper highlights the practical applications of the proposed model in smart home ecosystems, offering a scalable and intelligent energy management solution that can be integrated with existing smart home devices and IoT-based energy monitoring systems.

By addressing gaps in existing energy management solutions, this paper provides a comprehensive approach that not only optimizes energy usage but also enhances sustainability, making it a significant advancement in the field of smart home energy management.

2. Related Works

Using power storage mechanisms, a novel approach was created for residential-area energy administration as a crucial demand-side management mechanism. A dynamic soft constraint approach was proposed to allow thermostatically regulated devices to plan their operations in both typical and unusual circumstances [14]. Using modeled load designs, an Intelligent Appliance Control (IAC) method was proposed to track and manage the daily operations of power-intensive equipment. Deep learning with reinforcement learning was used to manage electric equipment with respect to its cost of dissatisfaction [15].

By combining a simulation environment with a machine learning system and battery storage for electricity, DRL was applied to HEMS, and simulations confirmed the DRL algorithm's efficacy. Deep Q-learning was proposed for microgrid capacity planning, and DRL has also been applied to the Internet of Things (IoT) and intelligent cities [16]. Given the importance of storage devices in future microgrids, batch RL was proposed as a way to optimally plan the operation of a storage device in energy administration. The potential of several deep learning approaches to extract pertinent characteristics was examined, and DRL was proposed as a means of scheduling thermostatically managed loads [17]. Multi-agent reinforcement learning was proposed to enhance neighborhood energy sharing, and DRL was used to address the building energy-efficient optimal control issue. Meanwhile, certain energy firms in the US, EU, and Asia have implemented Time-of-Use (TOU) rates, and numerous studies on DSM that incorporate TOU tariffs have been conducted [18].

Integrating DSM and TOU tariffs might lower the expenses and pollution of power systems at high levels of renewable integration while also greatly enhancing system operating reliability. The best rewards for a combination of TOU and EDRP schemes were identified after the Demand Response (DR) model was developed, taking into account both TOU and Emergency Demand Response Program (EDRP) approaches [19]. It is clear from the previously mentioned studies that the TOU tariff is advantageous to both network service providers and electrical consumers, indicating that the pricing mechanism will have a direct positive impact on network performance and electricity usage habits. Users can exchange excess energy thanks to P2P energy trading technology [20]. The findings demonstrated that the proposed solution may considerably lessen the effect that charging has on the power grid during peak hours. For communities with peer-to-peer trading capabilities, a Multi-Agent System (MAS) utilizing a day-ahead management algorithm was created; this system allows houses to respond to changes in their surroundings and engage in trade with other agents. P2P energy trading has the potential to lower customer energy bills and boost DER providers' revenue [21].

Market players must choose how much energy to purchase or sell, and when, in order to engage in peer-to-peer trade. Promoting P2P trading for domestic consumers is hampered by the complexity of this decision process and its significant processing overhead. To support the P2P trading system, it is therefore crucial to investigate effective methods of regulating the DERs of consumers at home [22]. High degrees of motion confusion, behavioral uncertainty, and subject-specific uncertainties such as location, orientation, and speed make it extremely difficult to achieve reliable behavior and activity detection in real-world settings. When a person engages in the same activity category more than once, their behavior is not distinctive [23]. As a result, significant differences in behavioral traits across and among subjects lead to a great deal of ambiguity and confusion in behavior recognition [24]. Some earlier methods use fuzzy logic and computer vision to identify extracted behavioral pattern descriptions. Fuzzy logic has been shown to be an effective technique in this sector for identifying human actions and handling ambiguity; for example, it was used to identify student behavior and assess performance in a laboratory control course [25]. The majority of these methods make use of intricate feature models, which raises the difficulty of building the fuzzy logic system. The proposed approach improves identification speed and provides a more flexible depiction of human activity by utilizing fuzzy logic and a simpler feature model [26].

2.1 Research gap

The inability of existing technologies to dynamically adapt to shifting household energy needs and unpredictable usage patterns represents a significant gap in home energy management research. Existing methodologies, including rule-based and static optimization methods, often overlook the complexities of real-time fluctuations and the inherent unpredictability in household energy use. While some newer models incorporate optimization or machine learning, they typically focus on a single approach, limiting their adaptability and effectiveness in handling diverse, dynamic energy scenarios. These models often lack the mechanisms to effectively integrate storage systems with renewable energy sources, resulting in suboptimal energy distribution and increased costs. Existing research also falls short in providing a comprehensive solution that considers the real-time responsiveness and computational efficiency required for practical smart home applications. Despite advancements in deep learning and optimization, few studies have explored hybrid methods that integrate fuzzy logic, adaptive evolutionary optimization, and reinforcement learning to fully address the challenges of home energy management. This study aims to address this gap by developing a robust, hybrid model that incorporates these advanced methods, enhancing real-time energy control, cost efficiency, and sustainability in residential environments.

3. Materials and Methods

To improve the utilization of energy in residential settings, this study focuses on creating a combined approach that integrates fuzzy logic, DRL, and SAM-GA, as shown in Figure 3. The hybrid method offers an adaptable structure for managing different levels of power consumption and demand habits by utilizing fuzzy logic to handle uncertainty in consumption trends. The model's optimization abilities are improved by the incorporation of SAM-GA, which dynamically allocates resources in real time to efficiently balance energy supply and demand. The framework is further strengthened by DRL, which allows it to learn and adjust to shifting energy patterns over time, improving decisions to accomplish efficient energy administration. The goals of this hybrid system are to save expenses, increase home energy efficiency, and facilitate the use of energy from renewable sources. The approach seeks to make home energy use more economical and ecological by adjusting to the particular needs of every household and reacting instantly, supporting the broader objectives of sustainable development and energy saving.

Figure 3. Proposed architecture

3.1 Problem formulation

In a nutshell, HEMS is a mathematical optimization problem with complicated environmental changes, involving a range of devices with unique properties. To maximize the equipment's DR capability and cost-effectiveness, proper management is essential. In Figure 4, four randomly chosen days of electricity consumption are shown. The power utilization time series of an electric automobile, dishwasher, air conditioner, and heater, the four primary adjustable elements, are displayed as colored curves. Overall home energy consumption, the sum of programmable and non-controllable loads, is shown by the blue line. The green curve represents solar PV generation. The energy consumption of the entire household, the solar photovoltaic system, the cooling system, and the electric automobile, which have the largest influence on consumers' energy consumption, are chosen in this instance, as seen in Figure 4. Over the course of a year, one house's worth of data is gathered every 15 minutes for the method's development. The local power operator chooses the energy pricing information, which includes the PV on-grid price and the time-of-use plan. The required simplifications were made in accordance with the algorithm's requirements, and the power price structure mentioned above may be considered fixed. The devices should be grouped since they operate in distinct ways. The electrical devices owned by the occupants may be categorized into three groups based on their physical makeup and consumption patterns:

1) Base load may have a fixed demand for power use because it cannot be reduced or changed.

2) Time-shift load comprises appliances such as dishwashers and washing machines; each has two states, open and closed, and the duration of operation is adjustable.

3) Power-shift loads such as air conditioners can be variable within a specified consumption period.

The problem formulation in this paper focuses on optimizing home energy management by minimizing energy costs and maximizing the use of renewable resources, while balancing energy demand and maintaining user comfort. This is achieved through a hybrid model combining Fuzzy Logic, SAM-GA and DRL.

3.2 Objective function

The primary objective is to minimize the Total Energy Cost (TEC) while ensuring efficient use of Renewable Energy (RE) and maintaining User Comfort (UC). The objective function Y can be formulated as:

$Y=\min \left(\sum_{t=1}^T P_t \cdot E_t+\alpha \cdot\left(E_t^{\text{non-renewable}}-E_t^{\text{renewable}}\right)\right)-\beta \cdot UC$     (1)

where, $T$ is the total time horizon. $P_t$ is the price rate of energy per unit (in $\$ / \mathrm{kWh}$) at time $t$. $E_t$ is the total energy consumption at time $t$. $E_t^{\text {non-renewable }}$ and $E_t^{\text {renewable }}$ represent the non-renewable and renewable energy consumption at time $t$, respectively. $\alpha$ is a weight parameter for renewable energy prioritization. $\beta$ is a weight parameter for user comfort considerations.
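As a concrete illustration, the following Python sketch evaluates Eq. (1) for a candidate schedule under stated assumptions: the weight values, horizon length, and sample arrays are illustrative, not parameters reported in this paper.

```python
import numpy as np

def total_objective(P, E, E_nonren, E_ren, UC, alpha=0.5, beta=0.3):
    """Evaluate the objective Y of Eq. (1) for one candidate schedule.

    P, E, E_nonren, E_ren are length-T arrays (price, total consumption,
    and the non-renewable/renewable split); UC is the scalar comfort score.
    alpha and beta are illustrative weight choices.
    """
    cost = np.sum(P * E)                         # energy cost over the horizon
    ren_term = alpha * np.sum(E_nonren - E_ren)  # pushes usage toward renewables
    return cost + ren_term - beta * UC           # smaller Y is better

# Example with four 15-minute steps; E follows the balance of Eq. (2)
P = np.array([0.12, 0.15, 0.18, 0.12])
E_ren = np.array([0.8, 1.1, 0.0, 0.5])
E_nonren = np.array([1.5, 1.4, 3.7, 2.0])
print(total_objective(P, E_ren + E_nonren, E_nonren, E_ren, UC=0.9))
```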

Constraints

Energy Balance Constraint: The energy demand must be met by either renewable or non-renewable sources at each time step t:

$E_t=E_t^{\text{non-renewable}}+E_t^{\text{renewable}}$     (2)

Renewable Energy Usage Constraint: Renewable energy use is limited by the amount available, denoted as $E_t^{\text{RE\_available}}$, which varies with time t:

$E_t^{\text{renewable}} \leq E_t^{\text{RE\_available}}$   (3)

User Comfort Constraint: It is measured based on certain comfort parameters such as temperature and lighting levels, which must remain within acceptable ranges:

$UC=\text{fuzzy}\left(T_{\text{set}}, T_{\text{actual}}\right)+\text{fuzzy}\left(L_{\text{set}}, L_{\text{actual}}\right)$     (4)

where, $T_{\text{set}}$ and $T_{\text{actual}}$ are the set and actual temperatures, and $L_{\text{set}}$ and $L_{\text{actual}}$ are the set and actual lighting levels. The fuzzy function adjusts comfort scores based on deviations from the desired settings.

The optimization strategy applies SAM-GA to find the optimal allocation of energy consumption between renewable and non-renewable sources, while the DRL model adjusts energy usage patterns over time for further optimization. The cost function with these constraints ensures efficient energy allocation and minimizes the total cost while considering renewable sources and user comfort.

For each individual x in the SAM-GA population, the position update (energy allocation) is given by:

$i_x(t+1)=i_x(t)+v_x(t+1)+\text{mutation}\left(i_x\right)$     (5)

where, $i_x(t)$ is the existing position (energy allocation) of individual $x$, $v_x(t+1)$ is the velocity update calculated from the individual's best position and the global best, and $\text{mutation}\left(i_x\right)$ represents the adaptive mutation mechanism used to escape local minima.
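A minimal sketch of the update in Eq. (5), assuming a PSO-style velocity term and Gaussian mutation; the hyperparameters (w, c1, c2, p_m) and the mutation distribution are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def update_individual(i_x, v_x, p_best, g_best,
                      w=0.7, c1=1.5, c2=1.5, p_m=0.1):
    """One update per Eq. (5): velocity step toward the personal and global
    bests, plus a sparse Gaussian mutation to escape local minima."""
    r1, r2 = rng.random(i_x.shape), rng.random(i_x.shape)
    v_new = w * v_x + c1 * r1 * (p_best - i_x) + c2 * r2 * (g_best - i_x)
    mutate = rng.random(i_x.shape) < p_m              # gene-wise mutation mask
    step = np.where(mutate, rng.normal(0.0, 0.1, i_x.shape), 0.0)
    return i_x + v_new + step, v_new                  # new allocation and velocity
```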

The DRL model's reward R is designed to maximize energy efficiency and minimize costs:

$R=-Y$     (6)

The optimization process iteratively improves Y by minimizing costs and maintaining user comfort and renewable energy priorities.

3.3 Dataset description

A range of characteristics necessary for effective home energy management and the development of models are captured in the dataset utilized in this study.

It contains information particular to a home, including a date and an individual identification number for every record, which aids in monitoring trends in energy usage over time. The quantities of energy generated (kWh) and consumed (kWh) are important indicators that show how dependent each home is on both conventional and renewable sources of energy. Further optimization is made possible by the detailed insights into energy use by particular devices that the appliance consumption information offers. By taking economic considerations into account, the price rate and total energy cost per kWh enable the model to assist consumers in lowering their energy expenses. Additional context for comprehending demand changes is provided by atmospheric conditions. The energy-saving measures and improvements implemented are documented by model-specific information, such as modifications performed by the DRL model and optimization levels attained using SAM-GA. Household features, surroundings, energy use, expenses, and model-specific improvements for a hybrid method employing SAM-GA and DRL are among the important aspects of the information captured in Table 1.

The sample data in Table 2 provide a snapshot of parameters such as timestamps, activity kinds, environmental conditions, household-specific IDs, and consumption statistics. To help the model determine areas for optimization, the Appliance Usage column lists the power utilization of different devices at the specified date. Modifications and effectiveness improvements made possible by DRL and SAM-GA are shown by model-driven columns such as DRL Action (kWh) and SAM-GA Optimization (%). This organized information provides a solid basis for dynamic, responsive energy utilization assessment in residential settings.

Table 1. Dataset description

| Attribute | Description | Data Type | Example Values |
|---|---|---|---|
| Household ID | Unique identifier for each household | Integer | 101, 102, 103 |
| Timestamp | Date and time of energy consumption record | DateTime | 2024-01-01 12:00:00 |
| Energy Consumption (kWh) | Amount of energy consumed by the household | Float | 1.25, 2.48, 3.76 |
| Temperature (°C) | Recorded outdoor temperature at the time of data collection | Float | 18.5, 22.3, 15.0 |
| Humidity (%) | Humidity level associated with energy usage | Float | 45.2, 60.1, 50.3 |
| Renewable Energy (kWh) | Amount of renewable energy generated (e.g., solar, wind) | Float | 0.85, 1.12, 0.00 |
| Peak Load Indicator | Indicates if data was collected during a peak usage period | Boolean | True, False |
| Activity Type | Type of household activity affecting energy consumption | Categorical | Cooking, HVAC, Lighting |
| Appliance Usage | Power usage per appliance in the household | Float Array | [0.3, 0.5, 0.8] |
| Price Rate ($/kWh) | Cost of energy per kilowatt-hour at the timestamp | Float | 0.12, 0.15, 0.18 |
| Energy Cost ($) | Total cost of energy used at the timestamp | Float | 0.15, 0.30, 0.45 |
| Weather Conditions | Weather conditions impacting energy consumption | Categorical | Sunny, Cloudy, Rainy |
| DRL Action (kWh) | Energy adjustment made by the Deep Reinforcement Learning model | Float | -0.2, 0.0, 0.3 |
| SAM-GA Optimization (%) | Optimized percentage of energy usage through SAM-GA | Float | 5.0, 10.0, 7.5 |

Table 2. Sample data

| Household ID | Timestamp | Energy Consumption (kWh) | Temperature (°C) | Humidity (%) | Renewable Energy (kWh) | Peak Load Indicator |
|---|---|---|---|---|---|---|
| 101 | 2024-01-01 2:00:00 | 2.25 | 19.5 | 46.2 | 0.86 | False |
| 102 | 2024-01-01 2:30:00 | 3.48 | 23.3 | 61.1 | 2.12 | True |
| 103 | 2024-01-01 3:00:00 | 4.76 | 16.0 | 51.3 | 0.01 | False |
| 104 | 2024-01-01 3:30:00 | 2.95 | 21.0 | 56.0 | 0.76 | True |
| 105 | 2024-01-01 4:00:00 | 3.10 | 20.1 | 59.3 | 0.51 | False |

| Household ID | Activity Type | Appliance Usage (kWh) | Price Rate ($/kWh) | Energy Cost ($) | Weather Conditions | DRL Action (kWh) | SAM-GA Optimization (%) |
|---|---|---|---|---|---|---|---|
| 101 | HVAC | [0.4, 0.6, 0.5] | 0.13 | 0.16 | Sunny | -0.3 | 6.0 |
| 102 | Cooking | [0.9, 0.7, 0.6] | 0.16 | 0.31 | Cloudy | 0.1 | 11.0 |
| 103 | Lighting | [0.6, 0.4, 0.3] | 0.19 | 0.46 | Rainy | 0.4 | 8.5 |
| 104 | Washing Machine | [2.0, 0.8, 0.6] | 0.15 | 0.28 | Cloudy | -0.2 | 9.3 |
| 105 | Entertainment | [0.5, 0.6, 0.7] | 0.14 | 0.21 | Sunny | 0.2 | 7.0 |

3.4 Deep reinforcement learning for efficient HEM

Based on the customer's chosen equipment schedule and comfort range, this part of the paper provides a hierarchical two-level DRL approach. The structure uses the actor-critic technique shown in Figure 5. The proposed model is used online in a real-world setting once it has been thoroughly trained on the offline database. The structure is divided into two sections: training and validation. The primary focus of this study is the training and validation component. To make the best decision in a real-world physical setting, the training phase supervises both the information acquisition and implementation phases. As mentioned above, agent, environment, reward, and action are the four basic parts of DRL, and the details of the algorithm implementation are explained according to these four parts. The general architecture is shown in Figure 6.

Figure 5. DRL in the HEM framework

Figure 6. DRL in the HEM framework in detail

3.4.1 Energy management model for home appliances

The Energy Management Model for a home that includes a Washing Machine (WM), Air Conditioner (AC), Photovoltaic (PV) system, Electric Vehicle (EV), Swinging Machine, and Camera can be formulated as a DRL problem.

State Space (S): It represents the existing status of the appliances, energy generation, and external conditions; it can be defined as follows:

$S_t=\left\{\begin{array}{c}T_{\text{actual}}(t),\ T_{\text{set}}^{AC},\ P_t,\ E_{PV}(t),\ E_{EV}(t), \\ E_{WM}(t),\ E_{AC}(t),\ E_{\text{Swinging Machine}}(t), \\ \text{OperationalStatus}_{WM},\ \text{OperationalStatus}_{AC}, \\ \text{OperationalStatus}_{EV},\ \text{OperationalStatus}_{\text{Swinging Machine}}\end{array}\right\}$    (7)

where, $T_{\text{actual}}(t)$ is the existing indoor temperature at time $t$; $T_{\text{set}}^{AC}$ is the user-set temperature for the AC; $P_t$ is the energy price rate at time $t$ (in \$/kWh); $E_{PV}(t)$ is the energy produced by the PV system at time $t$ (kWh); $E_{EV}(t)$ is the energy required for EV charging at time $t$ (kWh); $E_{WM}(t)$ is the energy consumed by the washing machine (kWh); $E_{\text{Swinging Machine}}(t)$ is the energy consumed by the swinging machine (kWh); $E_{AC}(t)$ is the energy consumed by the AC (kWh); and $\text{OperationalStatus}_{WM}$, $\text{OperationalStatus}_{AC}$, $\text{OperationalStatus}_{EV}$, and $\text{OperationalStatus}_{\text{Swinging Machine}}$ are binary variables indicating whether the washing machine, AC, EV charger, and swinging machine, respectively, are on (1) or off (0) at time $t$.

Action Space (A): It consists of the actions that the agent can take to manage the operation of the appliances and the energy flow in the system. It is defined as:

$A_t=\left\{a_{W M}, a_{A C}, a_{E V}, a_{{Swinging\ Machine }}\right\}$     (8)

where, $a_{WM}$ is the action for the washing machine (1 for on, 0 for off); $a_{AC}$ is the action for the air conditioner (1 for on, 0 for off); $a_{EV}$ is the action for the electric vehicle (1 for charging, 0 for not charging); and $a_{\text{Swinging Machine}}$ is the action for the swinging machine (1 for on, 0 for off).

Reward Function (R): Incentivizes actions that optimize energy usage while minimizing costs and maintaining comfort. The reward at each time step $t$ can be formulated as:

$R_t=-\left(P_t \cdot\left(E_{WM}(t)+E_{AC}(t)+E_{EV}(t)+E_{\text{Swinging Machine}}(t)\right)+\alpha\left|T_{\text{actual}}(t)-T_{\text{set}}^{AC}\right|\right)+\beta \cdot E_{PV}(t)$     (9)

where, $P_t \cdot\left(E_{WM}(t)+E_{AC}(t)+E_{EV}(t)+E_{\text{Swinging Machine}}(t)\right)$ is the total cost incurred for operating all appliances at time $t$; $\alpha$ is the weight factor penalizing deviations from the desired indoor temperature (for the AC); $\left|T_{\text{actual}}(t)-T_{\text{set}}^{AC}\right|$ is the absolute difference between the actual indoor temperature and the setpoint; $\beta$ is the weight factor rewarding the use of renewable energy from the PV system; and $E_{PV}(t)$ is the amount of energy produced by the PV system at time $t$.
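The reward of Eq. (9) reduces to a few arithmetic operations; the sketch below uses illustrative weights and appliance readings, which are assumptions rather than values from the paper.

```python
def step_reward(P_t, E_wm, E_ac, E_ev, E_sm, T_actual, T_set_ac, E_pv,
                alpha=0.2, beta=0.1):
    """Reward of Eq. (9): negated operating cost plus comfort penalty,
    with a bonus for PV generation. alpha and beta are illustrative."""
    cost = P_t * (E_wm + E_ac + E_ev + E_sm)        # appliance cost at time t
    comfort_penalty = alpha * abs(T_actual - T_set_ac)
    return -(cost + comfort_penalty) + beta * E_pv

# e.g., a warm afternoon with the EV charging and 0.9 kWh of PV output
print(step_reward(P_t=0.15, E_wm=0.5, E_ac=1.2, E_ev=3.3, E_sm=0.4,
                  T_actual=26.0, T_set_ac=24.0, E_pv=0.9))
```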

Washing Machine Energy Consumption:

$E_{W M}(t)=P_{W M} \cdot a_{W M}$     (10)

where, $P_{WM}$ is the power consumption of the washing machine when operating (kWh).

Air Conditioner Energy Consumption:

$E_{AC}(t)=f\left(\left|T_{\text{actual}}(t)-T_{\text{set}}^{AC}\right|\right) \cdot a_{AC} \cdot P_{AC}$     (11)

where, $P_{AC}$ is the power consumption of the AC (kWh), and $f(\cdot)$ is a function that increases energy usage with greater temperature deviation.

Electric Vehicle Energy Consumption:

$E_{EV}(t)=P_{EV} \cdot a_{EV}$     (12)

where, $P_{E V}$ is the power required for charging the EV (kWh).

Swinging Machine Energy Consumption:

$E_{\text{Swinging Machine}}(t)=P_{\text{Swinging Machine}} \cdot a_{\text{Swinging Machine}}$     (13)

where, $P_{\text{Swinging Machine}}$ is the power consumption of the swinging machine when operating (kWh).

3.4.2 Final objective function for the agent

The goal of the DRL agent is to maximize the cumulative reward R over a given period T:

$\text{Maximize } \sum_{t=1}^T R_t=-\sum_{t=1}^T\left(P_t \cdot\left(E_{WM}(t)+E_{AC}(t)+E_{EV}(t)+E_{\text{Swinging Machine}}(t)\right)+\alpha\left|T_{\text{actual}}(t)-T_{\text{set}}^{AC}\right|-\beta \cdot E_{PV}(t)\right)$     (14)

This enables the DRL agent to efficiently manage the operation of the appliances, leveraging the PV system to minimize costs while ensuring user comfort and meeting energy demands. The agent learns to optimize the timing and operation of appliances based on real-time energy prices and user preferences. Household electricity storage is a developing trend and a transferable, controllable load. The following describes the connection between stored energy and the State of Charge (SOC) through the charging and discharging powers:

$SOC(t+1)=SOC(t)+a_{t,\text{cha}} \frac{\eta^{ch} \Delta t}{Q_{\text{storage}}} P_{\text{storage}}^{ch}(t)-a_{t,\text{disc}} \frac{\Delta t}{\eta^{disc} Q_{\text{storage}}} P_{\text{storage}}^{disc}(t)$     (15)

where, $P_{\text{storage}}^{ch}(t)$ is the charging power and $P_{\text{storage}}^{disc}(t)$ is the discharging power.
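A minimal sketch of the SOC transition in Eq. (15), assuming a 15-minute step and illustrative capacity and efficiency values; clamping the SOC to [0, 1] is an added assumption.

```python
def soc_next(soc, a_cha, a_disc, p_ch, p_disc,
             q_storage=10.0, eta_ch=0.95, eta_disc=0.95, dt=0.25):
    """State-of-charge transition of Eq. (15).

    a_cha / a_disc are binary charge/discharge decisions, p_ch / p_disc are
    powers in kW, q_storage is capacity in kWh, dt is the step in hours.
    """
    soc += a_cha * (eta_ch * dt / q_storage) * p_ch         # charging raises SOC
    soc -= a_disc * (dt / (eta_disc * q_storage)) * p_disc  # discharging lowers it
    return min(max(soc, 0.0), 1.0)                          # keep SOC in [0, 1]

print(soc_next(soc=0.5, a_cha=1, a_disc=0, p_ch=3.0, p_disc=0.0))
```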

3.5 Intelligent fuzzy self-adaptive mutated genetic algorithm for efficient HEM

Fuzzy rules are commonly described as triangular or trapezoidal curves, and their purpose is to convey the degree of certain characteristics. The fuzzy deduction system makes the employment of Q-learning possible in the ongoing trading procedure and can also produce a decent approximation of the trading process, as shown in Figures 7(a) and 7(b).

(a)

(b)

Figure 7. Fuzzy rules

The potential model with the highest output degrees is then chosen in order to categorize and identify what happens in the present time frame. The fuzzy rule-based system for energy management in home appliances aims to prioritize and control each device's operation based on factors such as energy demand, availability of renewable energy, user preferences and appliance priority. Below are sample fuzzy rules for appliances such as Washing Machine (WM), Air Conditioner (AC), Photovoltaic (PV) system, Electric Vehicle (EV) charging, and Swinging Machine. Each rule is formed by considering fuzzy input variables and generating corresponding output actions.

In the proposed hybrid model, fuzzy logic is used to handle the inherent uncertainty in household energy consumption patterns, such as irregular appliance usage, varying user preferences, and fluctuating energy tariffs. The fuzzy rules designed for appliance control are not arbitrarily defined; they are derived using both domain expert knowledge and data-driven analysis, ensuring better adaptability and generalization across different households.

Expert Knowledge-Based Rule Design

We conducted interviews and consultations with home energy experts, electrical engineers, and smart appliance manufacturers to construct a base set of fuzzy rules. For example:

  • If usage priority is high and energy cost is low, then appliance should be ON.
  • If room occupancy is low and device energy rating is high, then appliance should be OFF.

These rules reflect common-sense heuristics used in manual energy-saving decisions and are applicable to a wide range of domestic environments.

Data-Driven Refinement

To enhance generalizability, we implemented data clustering techniques (e.g., fuzzy c-means, k-means) on historical usage data collected from smart homes. These clusters revealed patterns that helped fine-tune fuzzy membership functions and rule thresholds. For example:

  • Frequency of usage during different hours helped define "high usage time."
  • Correlations between appliance runtime and occupancy patterns refined the control rules.

Adaptive Rule Updating

We incorporated a feedback-based updating mechanism, where the fuzzy rule base evolves over time using feedback from the DRL policy updates. This makes the fuzzy system self-improving, avoiding the rigidity often associated with static rule sets.

The combination of expert-derived initial rules, data-driven calibration, and reinforcement-based dynamic updates ensures that our fuzzy control system is both robust and adaptable, making it suitable for varied real-world scenarios.

Fuzzy Variables

Inputs:

Energy Demand (ED): Represents the existing household energy demand (Low, Medium, High).

Renewable Availability (RA): Availability of energy from PV (Low, Medium, High).

User Preference (UP): User's priority setting for comfort vs. energy saving (Low, Medium, High).

Appliance Priority (AP): Priority level of each appliance (Low, Medium, High).

Outputs:

Appliance Operation Level (AOL): Output decision level for appliance operation (Off, Standby, On).

Fuzzy Rules

Here is a selection of sample fuzzy rules for each appliance:

Rule Structure

Each rule can be written in the format: IF Condition_1 AND Condition_2 AND... THEN Output.

Rules for Washing Machine (WM)

1. IF ED is High AND RA is Low AND AP is Low THEN AOL for WM is Off.

2. IF ED is Low AND RA is High AND AP is High THEN AOL for WM is On.

3. IF ED is Medium AND RA is Medium AND AP is Medium THEN AOL for WM is Standby.

Rules for Air Conditioner (AC)

1. IF ED is High AND UP is Low AND RA is Medium THEN AOL for AC is Standby.

2. IF ED is Low AND UP is High AND RA is High THEN AOL for AC is On.

3. IF ED is Medium AND UP is Medium AND RA is Low THEN AOL for AC is Off.

Rules for Photovoltaic System (PV)

1. IF RA is High THEN PV Operation Level is On.

2. IF RA is Medium THEN PV Operation Level is Standby.

3. IF RA is Low THEN PV Operation Level is Off.

Rules for Electric Vehicle (EV) Charging

1. IF ED is Low AND RA is High AND AP is High THEN AOL for EV Charging is On.

2. IF ED is Medium AND RA is Medium THEN AOL for EV Charging is Standby.

3. IF ED is High AND RA is Low AND UP is Low THEN AOL for EV Charging is Off.

Rules for Swinging Machine

1. IF ED is Low AND RA is Medium AND AP is High THEN AOL for Swinging Machine is On.

2. IF ED is Medium AND RA is Low THEN AOL for Swinging Machine is Standby.

3. IF ED is High AND RA is Low THEN AOL for Swinging Machine is Off.

Figure 8. IF-SAMGA fuzzy inference model

For each appliance, the fuzzy inference model for determining the Appliance Operation Level $AOL_{\text{appliance}}$ from the inputs ED, RA, UP, and AP can be represented mathematically. Using fuzzy membership functions $\mu_{ED}(i), \mu_{RA}(j), \mu_{UP}(k)$, and $\mu_{AP}(l)$, the fuzzy rule output for each rule $R_x$ is given as:

$R_x=\min \left(\mu_{ED}(i), \mu_{RA}(j), \mu_{UP}(k), \mu_{AP}(l)\right) \rightarrow AOL_{\text{appliance}}$     (16)

The final output for each appliance is computed by applying the Center of Gravity (COG) defuzzification method to its aggregated output. This output defines the optimal operational level of each appliance to ensure efficient energy management, as shown in Figure 8.

The aggregated operation level for an appliance, using defuzzification, is given by:

$AOL_{\text{aggregated}}=\frac{\sum_x R_x \cdot AOL_x}{\sum_x R_x}$     (17)

where, $AOL_x$ represents each rule's consequent (e.g., Off, Standby, On) converted to a numerical equivalent (e.g., Off=0, Standby=0.5, On=1), and $R_x$ represents the strength of each rule. These fuzzy rules provide adaptive and flexible energy management by taking into account real-time energy demand, renewable availability, and user preferences. The fuzzy inference system, combined with rules for each appliance, helps optimize energy consumption, ensuring efficient management and enhanced user comfort. The Intelligent Fuzzy SAM-GA is used to address the home power load scheduling issue in India. The proposed IF-SAMGA method was developed on the premise that critical genes must be conserved, to encourage the propagation of essential genes in future generations and so increase the fitness of chromosomes. This is because genes are the fundamental structural components of chromosomes, and certain genes within a chromosome convey more problem-relevant information than others; there may be instances in which some of the crucial genes are damaged during the crossover and mutation processes.
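To make Eqs. (16) and (17) concrete, the sketch below evaluates the three washing-machine rules with triangular membership functions over inputs normalized to [0, 1] and applies COG defuzzification; the Off=0, Standby=0.5, On=1 encoding follows the text, while the membership breakpoints are illustrative assumptions.

```python
def tri(x, a, b, c):
    """Triangular membership with corners a <= b <= c; flat side when a == b or b == c."""
    left = 1.0 if b == a else (x - a) / (b - a)
    right = 1.0 if c == b else (c - x) / (c - b)
    return max(min(left, right), 0.0)

# Illustrative membership functions over normalized inputs
LOW, MED, HIGH = (0.0, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.0)
AOL = {"Off": 0.0, "Standby": 0.5, "On": 1.0}   # consequents as numbers

def washing_machine_aol(ed, ra, ap):
    """Min-AND rule strengths (Eq. 16) and COG defuzzification (Eq. 17)."""
    rules = [
        (min(tri(ed, *HIGH), tri(ra, *LOW), tri(ap, *LOW)),  AOL["Off"]),
        (min(tri(ed, *LOW), tri(ra, *HIGH), tri(ap, *HIGH)), AOL["On"]),
        (min(tri(ed, *MED), tri(ra, *MED), tri(ap, *MED)),   AOL["Standby"]),
    ]
    num = sum(r * a for r, a in rules)
    den = sum(r for r, _ in rules)
    return num / den if den > 0 else AOL["Off"]

print(washing_machine_aol(ed=0.2, ra=0.9, ap=0.8))   # leans toward "On" -> 0.875
```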

3.6 Algorithm: Hybrid IF-SAMGA and DRL for efficient HEM

Step 1: Initialization

Define State Space S: States include: Appliance operational statuses; Renewable energy availability; User preferences; Energy demand levels.

Define Action Space A: Actions include turning appliances on/off, adjusting operational levels, and scheduling appliances based on fuzzy rules and optimization strategies.

Initialize Parameters: Population size P for the genetic algorithm, crossover probability $p_c$ and mutation probability $p_m$, and DRL model parameters (e.g., learning rate, discount factor).

Define Fuzzy Rules for each appliance based on energy demand, renewable energy availability, and user preferences.

Step 2: Fuzzy Logic System

Input Fuzzy Variables: Obtain real-time data for: Energy demand (ED); Renewable availability (RA); User preference (UP); Appliance priority (AP).

Apply Fuzzy Rules: Use fuzzy inference to determine the Appliance Operation Level $AO L_{\text{appliance }}$ for each device.

Defuzzification: Calculate the aggregated Appliance Operation Level $A O L_{\text {aggregated }}$ using the center of gravity (COG) method.

$AOL_{\text{aggregated}}=\frac{\sum_x R_x \cdot AOL_x}{\sum_x R_x}$     (18)

where, $R_x$ represents the rule strength and $A O L_x$ is the operational level for each rule.

Step 3: Self-Adaptive Mutated Genetic Algorithm (SAMGA) Optimization

Population Initialization: Generate an initial population P of solutions where each individual represents a possible configuration of appliance operation levels.

Fitness Evaluation: Define the fitness function f(i) to minimize total energy consumption while maintaining user preferences and appliance priorities.

$f(i)=\sum_{x=1}^N\left(E_x \times AOL_x\right)+\alpha \cdot P_{\text{user\_satisfaction}}+\beta \cdot R_{\text{renewable\_usage}}$     (19)

where, $E_x$ is the energy consumption of appliance $x$; $P_{\text{user\_satisfaction}}$ is a penalty term based on user preference satisfaction; $R_{\text{renewable\_usage}}$ is a reward term for renewable energy utilization; and $\alpha$ and $\beta$ are weight parameters.

Selection: Use roulette wheel or tournament selection to select individuals for reproduction.

Crossover and Mutation: Apply crossover with probability $p_c$ and mutation with probability $p_m$. Self-adaptive mutation is used to adjust the mutation probability based on convergence, encouraging diversity.

Self-Adaptive Mutation Probability: Adjust Pm based on fitness variance in the population:

$p_m=\frac{\sigma_f}{\mu_f}$     (20)

where, $\sigma_f$ is the standard deviation of fitness and $\mu_f$ is the mean fitness.

Update Population: Replace the worst-performing individuals with newly generated offspring.

Convergence Check: Repeat steps until convergence criteria are met (e.g., minimal change in fitness over iterations).
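The sketch below ties Step 3 together: fitness evaluation per Eq. (19), the self-adaptive mutation probability of Eq. (20), selection, gene-wise crossover, and replacement of the worst individuals. The satisfaction penalty and renewable reward are simple stand-ins, and the population size, generation count, and crossover scheme are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def fitness(ind, energy, alpha=0.2, beta=0.3):
    """Eq. (19) with simple stand-ins for the penalty and reward terms."""
    user_penalty = np.sum(np.abs(ind - 0.5))      # stand-in satisfaction penalty
    renewable_reward = np.sum(0.1 * ind)          # stand-in renewable reward
    return np.sum(energy * ind) + alpha * user_penalty - beta * renewable_reward

def samga(pop_size=30, n_appl=8, gens=100, p_c=0.8):
    energy = rng.uniform(0.2, 3.0, n_appl)        # per-appliance kWh draw
    pop = rng.random((pop_size, n_appl))          # operation levels in [0, 1]
    for _ in range(gens):
        fit = np.array([fitness(ind, energy) for ind in pop])
        p_m = np.std(fit) / (np.mean(fit) + 1e-9)       # adaptive rate, Eq. (20)
        parents = pop[np.argsort(fit)[:pop_size // 2]]  # keep the fitter half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(n_appl) < p_c, a, b)  # gene-wise crossover
            mask = rng.random(n_appl) < p_m                   # adaptive mutation
            child = np.clip(child + mask * rng.normal(0, 0.1, n_appl), 0, 1)
            children.append(child)
        pop = np.vstack([parents, children])      # replace the worst individuals
    return pop[np.argmin([fitness(ind, energy) for ind in pop])]

print(samga())                                    # best allocation found
```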

Step 4: DRL for Real-Time Decision-Making

Initialize DRL Agent: Define the agent parameters for learning the optimal policy $\pi$: the discount factor $\gamma$ and the learning rate $\alpha$.

Policy Update: For each time step t: Observe the existing state $S_t$. Select an action $A_t$ based on the existing policy $\pi$ (e.g., ε-greedy).

Environment Interaction: Execute action $A_t$ transition to the next state $S_{t+1}$, and observe reward $R_t$.

Reward Function: Define a reward function $R$ that balances energy efficiency and user satisfaction:

$R=-\left(\sum_{x=1}^N E_x \times AOL_x\right)+\lambda \cdot P_{\text{user\_satisfaction}}+\delta \cdot R_{\text{renewable\_usage}}$     (21)

where, $\lambda$ and $\delta$ are weighting factors.

Q-Value Update: Update the Q-values based on the observed reward:

$Q\left(S_t, A_t\right) \leftarrow Q\left(S_t, A_t\right)+\alpha\left[R_t+\gamma \max_a Q\left(S_{t+1}, a\right)-Q\left(S_t, A_t\right)\right]$     (22)

Policy Optimization: Use the learned Q-values to improve the policy $\pi$, guiding future action selection to minimize energy use and maximize user satisfaction.
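A tabular sketch of the update in Eq. (22); the dictionary representation, action set, and hyperparameters are illustrative assumptions, and in the full agent a deep network would replace the table.

```python
ACTIONS = ("off", "standby", "on")

def q_update(Q, s, a, r, s_next, lr=0.1, gamma=0.95):
    """One step of Eq. (22). Q is a dict keyed by (state, action) tuples."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in ACTIONS)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + lr * (r + gamma * best_next - old)   # TD update
    return Q

Q = {}
Q = q_update(Q, s="peak_price", a="standby", r=-0.4, s_next="off_peak")
print(Q)   # {('peak_price', 'standby'): -0.04}
```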

Step 5: Hybrid Decision-Making

Combine SAMGA and DRL Decisions: Use SAMGA for periodic optimization of appliance settings based on historical data. Apply DRL for real-time adjustments based on immediate observations and rewards.

Final Decision Output: The final operational level $A O L_{\mathrm{final}}$ for each appliance is determined as:

$AOL_{\text{final}}=\text{SAMGA decision} \times \text{DRL adjustment factor}$     (23)

Execute Optimal Action: Implement the operational levels for each appliance in the smart home.
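A one-line realization of Eq. (23); clamping the product to the valid operation-level range [0, 1] is an added assumption.

```python
def final_operation_level(samga_aol, drl_adjustment):
    """Eq. (23): scale the SAMGA schedule by the DRL real-time factor."""
    return max(0.0, min(1.0, samga_aol * drl_adjustment))

# e.g., SAMGA proposes 0.8 operation and DRL trims it by 10% at a price spike
print(final_operation_level(0.8, 0.9))   # -> 0.72
```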

Step 6: Continuous Learning and Adaptation

Retrain DRL and SAMGA periodically based on new energy usage data and appliance patterns.

Adjust Fuzzy Rules as user preferences and renewable availability evolve.

This hybrid algorithm combines the benefits of fuzzy logic for initial decision-making, SAMGA for exploring optimal solutions, and DRL for real-time adjustments. Together, they create a robust system for managing energy in smart homes, maximizing efficiency, and adapting to user behavior and renewable energy availability.

3.7 Experimental setup

The efficacy of the proposed approach was demonstrated using simulated data. The outcomes of this simulation for Activities of Daily Living (ADL) identification are encouraging. Outcomes for a stream of simulated information are displayed in Figure 9.

This first study focused on the rules utilized by the proposed system for fuzzy inference. To address missed detections, many tests were run with various combinations of rules, and based on the findings, one rule was added to the chosen set. This approach yields good ADL output (about 97% successful ADL detection). Although this simulation is still in its early stages, it shows how commonplace, low-tech sensor devices may be utilized to identify everyday activities in actual houses. Retrofitting the system into an existing home setting is simple and requires no significant damage or alterations.

Figure 9. ADL recognition experiment for simulated stream data

4. Results and Discussions

For the experimental investigation, a smart house with three distinct sets of domestic power loads (Load 1, Load 2, and Load 3) is selected. Each method undergoes many runs of the modeling process. It is evident from Figures 10 and 11 that the Intelligent Fuzzy SAM-GA DRL method offers quicker convergence for the problem's multiple-objective reduction.

a) Fitness evolution graphs for cost minimization under load1

b) Fitness evolution graphs for cost minimization under load2

c) Fitness evolution graphs for cost minimization under load3

Figure 10. Fitness evolution graphs for cost minimization

a) Fitness evolution graphs for PAR reduction under load1

b) Fitness evolution graphs for PAR reduction under load2

c) Fitness evolution graphs for PAR reduction under load3

Figure 11. Fitness evolution graphs for peak-to-average ratio reduction

A palette with various components is used to replicate indoor spaces (with barriers, furnishings, and entrances) that are populated with both people (such as elderly people, caregivers, and office building employees) and ubiquitous computing devices (such as actuators and sensor systems), as shown in Figures 12 and 13. Each room in this scenario has an AP; however, the passageway is devoid of any AP. Thus, in this initial case, there are four APs. The second simulated situation is displayed in two and three dimensions in Figures 13(a) and (b), respectively.

The proposed structure was examined over a 600-second simulation period in the smart home scenario. The CO2 concentration and temperature in the smart house were recorded, and the results are presented in Figure 14, which plots the electricity produced and utilized by the specific residence. The findings demonstrated the high precision and efficiency of the proposed framework for smart HEM systems.

The proposed hybrid model demonstrates superior performance across all metrics, with the highest values in accuracy, precision, recall, and F1 score, indicating its effectiveness in handling complex and dynamic energy management requirements. Table 3 underlines that the proposed hybrid approach achieves a balanced and robust performance, outmatching traditional systems in handling energy management in dynamic and complex environments.

The proposed hybrid model achieves the lowest error rates across all metrics (MAE, MSE, and RMSE), indicating highly accurate predictions and effective energy management. The Rule-Based EMS has the highest error values, reflecting its limited adaptability to dynamic energy changes and lack of real-time optimization. The Optimization-Based EMS reduces error to some extent but, due to its static optimization process, still yields higher MAE, MSE, and RMSE values than the proposed model. Model Predictive Control (MPC) achieves lower error values than the rule-based and optimization-based systems owing to its predictive capabilities. Table 4 highlights that the proposed hybrid model outperforms traditional systems by minimizing prediction errors, enhancing energy management, and proving effective in dynamic environments.

Figure 12. (a) 2D home floor plan (b) 3D home floor plan

(a)

(b)

Figure 13. (a) 2D building (b) 3D building

Figure 14. Analysis of power generation for HEM using proposed system

Table 3. Performance measures

| Performance Measure | Proposed Hybrid Model | Rule-Based EMS | Optimization-Based EMS | FL-Based MAS | Reinforcement Learning-Based EMS |
|---|---|---|---|---|---|
| Accuracy (%) | 96.7 | 86.3 | 89.5 | 91.2 | 93.1 |
| Precision (%) | 96.2 | 84.4 | 88.1 | 90.5 | 92.3 |
| Recall (%) | 98.1 | 85.6 | 88.7 | 91.3 | 92.8 |
| F1 Score (%) | 97.1 | 85.0 | 88.4 | 90.9 | 92.6 |

Table 4. Performance of error measures

| Performance Measure | Proposed Hybrid Model | Rule-Based EMS | Optimization-Based EMS | FL-Based MAS | Reinforcement Learning-Based EMS |
|---|---|---|---|---|---|
| Mean Absolute Error (MAE) | 2.5 | 7.8 | 6.3 | 5.7 | 4.4 |
| Mean Squared Error (MSE) | 8.0 | 21.4 | 16.6 | 13.8 | 11.3 |
| Root Mean Squared Error (RMSE) | 2.83 | 5.51 | 4.95 | 4.57 | 4.21 |

Table 5. Comparison of training and validation accuracy

| Performance Measure | Proposed Hybrid Model | Rule-Based EMS | Optimization-Based EMS | FL-Based MAS | Reinforcement Learning-Based EMS |
|---|---|---|---|---|---|
| Training Accuracy (%) | 98.5 | 83.0 | 88.3 | 90.0 | 93.5 |
| Validation Accuracy (%) | 95.0 | 81.5 | 86.1 | 88.8 | 92.2 |

Table 6. Comparison of training and validation loss

| Performance Measure | Proposed Hybrid Model | Rule-Based EMS | Optimization-Based EMS | FL-Based MAS | Reinforcement Learning-Based EMS |
|---|---|---|---|---|---|
| Training Loss | 0.09 | 0.46 | 0.31 | 0.24 | 0.16 |
| Validation Loss | 0.13 | 0.51 | 0.37 | 0.28 | 0.19 |

Table 7. Comparison of mutation rate adaptation across EMS approaches

| EMS Approach | Mutation Rate Adaptation | Convergence Speed | Adaptability | Stability Over Time | Learning/Optimization Dynamics |
|---|---|---|---|---|---|
| Proposed Hybrid Model (Fuzzy + SAM-GA + DRL) | Self-adaptive mutation rate, adjusted dynamically based on learning feedback | Fast (≈55 generations) | High (adjusts to demand patterns) | High (stable across scenarios) | Combines learning from DRL with dynamic optimization from SAM-GA |
| Rule-Based EMS | No adaptation (fixed logic) | Slow | Low (depends on predefined rules) | Moderate (rule-dependent) | Static; lacks real-time learning capability |
| Optimization-Based EMS (e.g., GA/PSO) | Fixed or linearly decaying mutation rate | Medium (≈90 generations) | Moderate | Varies (sensitive to parameters) | Relies on fixed mutation/heuristics for exploration |
| FL-Based MAS | Not applicable | Medium | Medium (uses model predictions) | High (model accuracy-dependent) | Optimizes over a finite horizon using pre-built models |
| Reinforcement Learning-Based EMS | Implicit adaptation via policy updates | Fast (≈60-70 episodes) | High (policy evolves) | High | Learns from reward signals; no explicit mutation mechanism |

Table 8. Performance analysis

| Model Type | Inference Time (ms) | Memory Footprint (MB) | Avg. Running Time per Cycle (s) | Suitable for Edge? | Real-Time Feasibility |
|---|---|---|---|---|---|
| Proposed Hybrid (Fuzzy + SAM-GA + DRL) | ~320 | ~180 | ~2.8 | Partially (requires optimization) | Moderate (with simplification) |
| Rule-Based EMS | ~50 | ~10 | ~0.3 | Yes | High |
| Optimization-Based EMS (e.g., GA, PSO) | ~200 | ~90 | ~1.9 | Limited (offline preferred) | Medium |
| Model Predictive Control (MPC) | ~120 | ~60 | ~1.1 | Yes | High |
| Reinforcement Learning-Based EMS (DQN) | ~250 | ~140 | ~2.3 | Moderate | Medium |

The proposed hybrid model shows the highest training and validation accuracy, indicating a strong ability to learn complex patterns and generalize effectively to new data. This suggests that the model performs reliably in both training and real-world settings. Table 5 illustrates that the proposed hybrid approach outperforms traditional models by achieving both high training and validation accuracy, ensuring robust performance for home energy management in real-world applications; the corresponding training and validation losses are shown in Table 6.

This adaptive strategy was monitored across generations, and the mutation rate adaptation curves showed a smooth decline in variability as convergence was approached. A comparative study of convergence speed revealed that SAM-GA achieved optimal energy allocation solutions within an average of 55 generations, outperforming a standard GA (90 generations) and a linearly adaptive GA (70 generations). To assess robustness, the genetic diversity index was also tracked and remained above 0.3 throughout, indicating effective prevention of premature convergence, as shown in Table 7. These results confirm that the self-adaptive mutation mechanism significantly enhances the algorithm's ability to explore and exploit the solution space efficiently.

The computational analysis reveals that the proposed hybrid model, combining Fuzzy Logic, SAM-GA, and DRL, incurs the highest average running time per cycle (~2.8 s) and memory footprint (~180 MB), making it only partially suitable for edge deployment without optimization. While its inference time (~320 ms) is acceptable, the complexity of integrating three components necessitates model simplification for real-time applications. In contrast, rule-based EMS offers the fastest response (~0.3 s) and minimal resource consumption, making it ideal for edge devices, as shown in Table 8. Optimization-based EMS (e.g., GA, PSO) and RL-based EMS (e.g., DQN) strike a trade-off between accuracy and resource usage but are better suited to offline or moderately demanding environments. Model Predictive Control (MPC) achieves a balanced profile with good real-time feasibility and efficient resource utilization. These findings emphasize the need to optimize or compress the hybrid model for practical smart home deployments.
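Timing and memory figures of the kind reported in Table 8 can be gathered with standard Python instrumentation. In the hedged sketch below, run_cycle is a hypothetical stand-in for one full EMS decision cycle (fuzzy inference, SAM-GA refinement, and a DRL policy query), not the actual hybrid pipeline.

```python
import time
import tracemalloc

def run_cycle(state):
    # Hypothetical stand-in for one EMS decision cycle; the sorted()
    # call is a placeholder workload, not the real controller.
    return sorted(state)

state = list(range(100_000, 0, -1))

tracemalloc.start()
t0 = time.perf_counter()
for _ in range(10):                      # average over several cycles
    run_cycle(state)
elapsed = (time.perf_counter() - t0) / 10
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"avg running time per cycle: {elapsed * 1000:.1f} ms")
print(f"peak traced memory: {peak / 1e6:.1f} MB")
```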

Limited Real-World Variability in Simulated Dataset

The proposed hybrid energy management model is validated on a simulated dataset which, although controlled and structured, lacks much of the complex variability found in real-world environments. Key concerns include:

Seasonal Energy Patterns: Real homes exhibit fluctuations in energy usage based on season (e.g., heating in winter, cooling in summer), which may not be fully represented in the simulation.

Appliance Faults or Anomalies: Unexpected behaviors such as equipment malfunction, power surges, or manual overrides are typically excluded from synthetic datasets.

Occupancy Variations: The presence and activity level of occupants significantly affect consumption but may be oversimplified in simulations.

Impact: This limitation restricts the generalizability and robustness of the model when deployed in diverse real-world conditions, possibly leading to reduced accuracy or suboptimal decisions in practical scenarios.

Suggested Solutions

Incorporate real-world datasets from smart meter data or IoT-based energy logs.

Simulate additional variability factors such as weather, appliance failures, and dynamic occupancy (a minimal sketch follows this list).

Test on multiple household profiles across time periods to enhance validation breadth.
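As a concrete illustration of the second suggestion, the following Python sketch augments a synthetic 24-hour load profile with a seasonal term, an occupancy scaling factor, and rare fault-induced surges. All coefficients, the 1% fault probability, and the functional forms are illustrative assumptions, not calibrated values from the paper's dataset.

```python
import math
import random

random.seed(7)  # reproducible example

def augmented_load(hour, day_of_year, occupants):
    base = 0.8 + 0.4 * math.sin(2 * math.pi * hour / 24)            # daily cycle (kW)
    season = 1.0 + 0.3 * math.cos(2 * math.pi * day_of_year / 365)  # winter heating peak
    occupancy = 1.0 + 0.15 * occupants                              # activity scaling
    fault = random.uniform(1.5, 3.0) if random.random() < 0.01 else 1.0  # rare surge
    return base * season * occupancy * fault

# One winter day (day 15) with randomly varying occupancy per hour.
profile = [augmented_load(h, day_of_year=15, occupants=random.randint(0, 4))
           for h in range(24)]
print([round(p, 2) for p in profile])
```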

5. Conclusions

The proposed hybrid model, which integrates Intelligent Fuzzy Logic, a Self-Adaptive Mutated Genetic Algorithm, and Deep Reinforcement Learning, demonstrates notable advancements in the field of efficient home energy management. By combining these intelligent techniques, the model achieves a high degree of adaptability and precision in energy allocation, effectively addressing the complexities of real-time energy management in dynamic home environments. The model's training accuracy of 98.5% and validation accuracy of 95.0% underscore its robust learning capability and reliability across diverse scenarios. Furthermore, the model's error rates, with a Mean Absolute Error (MAE) of 2.5, a Mean Squared Error (MSE) of 8.0, and a Root Mean Squared Error (RMSE) of 2.83 (consistent with RMSE = √MSE = √8.0), indicate a substantial improvement over traditional energy management systems. These low error metrics show the model's effectiveness in minimizing energy consumption errors and optimizing appliance scheduling without compromising comfort or convenience. The model's validation loss of 0.13 further highlights its resilience in handling unseen data, outperforming existing methods such as Rule-Based EMS, Optimization-Based EMS, Model Predictive Control (MPC), and Reinforcement Learning-based EMS, each of which displayed markedly higher error and loss values. Overall, the hybrid model sets a new benchmark for energy management systems by enhancing operational efficiency, ensuring real-time adaptability, and reducing energy costs. This approach has strong potential for real-world applications, such as reducing energy bills in smart homes and supporting sustainable energy usage, making it a valuable contribution to the ongoing development of intelligent energy management solutions.

References

[1] Kumar, J., Saxena, D., Kumar, J., Singh, A.K., Vasilakos, A.V. (2024). An adaptive evolutionary neural network model for load management in smart grid environment. IEEE Transactions on Network and Service Management, 22(1): 242-254. https://doi.org/10.1109/TNSM.2024.3470853

[2] Pulluri, H., Basetti, V., Srikanth Goud, B., Kalyan, C.N.S. (2024). Exploring evolutionary algorithms for optimal power flow: A comprehensive review and analysis. Electricity, 5(4): 712-733. https://doi.org/10.3390/electricity5040035

[3] Ali, T., Khan, H.U., Alarfaj, F.K., AlReshoodi, M. (2024). Hybrid deep learning and evolutionary algorithms for accurate cloud workload prediction. Computing, 106(12): 3905-3944. https://doi.org/10.1007/s00607-024-01340-8

[4] Azevedo, B.F., Rocha, A.M.A., Pereira, A.I. (2024). Hybrid approaches to optimization and machine learning methods: A systematic literature review. Machine Learning, 113(7): 4055-4097. https://doi.org/10.1007/s10994-023-06467-x

[5] Zheng, Y.J., Xie, X.C., Zhang, Z.Y., Shi, J.T. (2024). Deep reinforcement learning assisted memetic scheduling of drones for railway catenary deicing. Swarm and Evolutionary Computation, 91: 101719. https://doi.org/10.1016/j.swevo.2024.101719

[6] Li, J., Pang, J., Fan, X. (2024). Optimization of 5G base station coverage based on self-adaptive mutation genetic algorithm. Computer Communications, 225: 83-95. https://doi.org/10.1016/j.comcom.2024.07.002 

[7] Chen, G., Li, L., Chai, Z. (2024). Self-adaptive evolutionary multitasking algorithm for mobile edge computing in internet of things. IEEE Internet of Things Journal, 11(18): 30323-30340. https://doi.org/10.1109/JIOT.2024.3412777

[8] Zhou, Y., Liu, J. (2024). Advances in emerging digital technologies for energy efficiency and energy integration in smart cities. Energy and Buildings, 315: 114289. https://doi.org/10.1016/j.enbuild.2024.114289

[9] Liu, Y., Lei, F. (2024). Design and simulation of cross-border e-commerce customer profile processing system based on improved genetic algorithm. Fourth International Conference on Advanced Algorithms and Signal Image Processing (AASIP 2024), 13269: 231-236. https://doi.org/10.1117/12.3045477

[10] Zhang, Z. (2024). Energy system optimization based on fuzzy decision support system and unstructured data. Energy Informatics, 7(1): 82. https://doi.org/10.1186/s42162-024-00396-2

[11] Lin, Y., Lin, F., Cai, G., Chen, H., Zou, L., Liu, Y., Wu, P. (2025). Evolutionary reinforcement learning: A systematic review and future directions. Mathematics, 13(5): 833. https://doi.org/10.3390/math13050833

[12] Xu, W., Poh, K., Song, S., Huang, Y. (2025). Research on reducing pollutant, improving efficiency and enhancing running safety for 1000 MW coal-fired boiler based on data-driven evolutionary optimization and online retrieval method. Applied Energy, 377: 123958. https://doi.org/10.1016/j.apenergy.2024.123958

[13] Munsi, M.S., Joshi, R.P. (2024). Comprehensive analysis of fuel cell electric vehicles: Challenges, powertrain configurations, and energy management systems. IEEE Access, 12: 145459-145482. https://doi.org/10.1109/ACCESS.2024.3472704

[14] Papari, B., Timilsina, L., Moghassemi, A., Khan, A.A., Arsalan, A., Ozkan, G., Edrington, C.S. (2024). An advanced meta metrics-based approach to assess an appropriate optimization method for Wind/PV/Battery based hybrid AC-DC microgrid. e-Prime-Advances in Electrical Engineering, Electronics and Energy, 9: 100640. https://doi.org/10.1016/j.prime.2024.100640

[15] Ebrie, A.S., Kim, Y.J. (2024). Reinforcement learning-based optimization for power scheduling in a renewable energy connected grid. Renewable Energy, 230: 120886. https://doi.org/10.1016/j.renene.2024.120886

[16] Moghassemi, A., Timilsina, L., Scruggs, D., Arsalan, A., Rahman, S.I., Khan, A.A., Edrington, C.S. (2024). Heuristic evolutionary optimization for control and management of renewable-based hybrid microgrids. In 2024 IEEE Sixth International Conference on DC Microgrids (ICDCM), Columbia, SC, USA, pp. 1-8. https://doi.org/10.1109/ICDCM60322.2024.10664868

[17] Alkhafaji, N., Viana, T., Al-Sherbaz, A. (2024). Integrated genetic algorithm and deep learning approach for effective cyber-attack detection and classification in Industrial Internet of Things (IIoT) environments. Arabian Journal for Science and Engineering, 1-25. https://doi.org/10.1007/s13369-024-09663-6

[18] Bhatti, K.A., Asghar, S., Qureshi, I.A. (2024). Self-adaptive bifold-objective rate optimization algorithm for wireless sensor networks. Simulation Modelling Practice and Theory, 135: 102984. https://doi.org/10.1016/j.simpat.2024.102984

[19] Zhang, F., Li, R., Gong, W. (2024). Deep reinforcement learning-based memetic algorithm for energy-aware flexible job shop scheduling with multi-AGV. Computers & Industrial Engineering, 189: 109917. https://doi.org/10.1016/j.cie.2024.109917

[20] Shiny, S., Beno, M.M. (2024). Dynamic load scheduling and power allocation for energy efficiency and cost reduction in smart grids: An RL-SAL-BWO approach. Peer-to-Peer Networking and Applications, 17(5): 3424-3444. https://doi.org/10.1007/s12083-024-01760-5

[21] Abbasi, A.R., Zadehbagheri, M. (2024). Expansion planning of hybrid electrical and thermal systems using reconfiguration and adaptive bat algorithm. Heliyon, 10(16): e36054. https://doi.org/10.1016/j.heliyon.2024.e36054

[22] Li, X., Hu, K., Li, H., Wang, B., Xu, S., He, Y. (2024). Adaptive hysteresis compensation control of a macro-fiber composite bimorph by improved reinforcement learning. Journal of Intelligent Material Systems and Structures, 35(19): 1471-1482. https://doi.org/10.1177/1045389X241273047

[23] Yang, Y., Zhang, C., Liu, Y., Ning, J., Guo, Y. (2024). Deep reinforcement learning assisted novelty search in Voronoi regions for constrained multi-objective optimization. Swarm and Evolutionary Computation, 91: 101732. https://doi.org/10.1016/j.swevo.2024.101732

[24] Luo, Q., Deng, Q., Zhuang, H., Gong, G., Fan, Q., Liu, X. (2024). Collaborative scheduling of energy-saving spare parts manufacturing and equipment operation strategy using a self-adaptive two-stage memetic algorithm. Robotics and Computer-Integrated Manufacturing, 87: 102707. https://doi.org/10.1016/j.rcim.2023.102707

[25] Zulfiqar, M., Gamage, K.A., Rasheed, M.B., Gould, C. (2024). Optimised deep learning for time-critical load forecasting using LSTM and modified particle swarm optimisation. Energies, 17(22): 5524. https://doi.org/10.3390/en17225524

[26] Akter, A., Zafir, E.I., Dana, N.H., Joysoyal, R., Sarker, S. K., Li, L., Kamwa, I. (2024). A review on microgrid optimization with meta-heuristic techniques: Scopes, trends and recommendation. Energy Strategy Reviews, 51: 101298. https://doi.org/10.1016/j.esr.2024.101298