Reinforcement Learning for Rolling Bearing Fault Diagnosis—A Comprehensive Review

Pratik Jadhav, Sairam V A, Abhyuday Singh, Shrikrishna Kolhar*, Smita Mahajan

Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune 412115, India

Corresponding Author Email: shrikrishna.kolhar@sitpune.edu.in
Page: 1185-1193 | DOI: https://doi.org/10.18280/jesa.570425
Received: 25 June 2024 | Revised: 2 August 2024 | Accepted: 14 August 2024 | Available online: 27 August 2024

© 2024 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Automatic fault detection and machine diagnosis play a crucial role in preventive maintenance. This study highlights the importance of fault diagnosis in machinery and emphasizes the benefits of preventive and predictive maintenance strategies. The paper overviews machine and deep learning techniques and feature extraction methods for automatic fault diagnosis in rolling bearings. The study discusses the challenges that machine and deep learning approaches face, including their limited adaptability to different operational conditions and environmental variations. It also suggests reinforcement learning as a potential solution for automatic rolling bearing fault detection. The study differentiates between various reinforcement learning methods, including model-based and model-free approaches, and underscores the advantages of deep reinforcement learning. Furthermore, it evaluates several studies that used reinforcement learning for feature optimization, parameter optimization, and addressing class imbalance in rolling bearing fault diagnosis. Lastly, the paper summarizes key findings and proposes future research directions, including integrating reinforcement learning with other machine or deep learning methods and developing new algorithms better suited for large datasets and real-time applications.

Keywords: 

deep learning, fault detection, machine learning, reinforcement learning, deep Q networks, predictive maintenance

1. Introduction

Machine diagnosis has become a vital part of industry, especially with the advent of Industry 4.0 and the growth of large-scale machinery [1]. Machinery is deployed across many domains, including manufacturing, transportation, healthcare, research, and energy management. Diagnosing and identifying faults is crucial for maintaining such complex and expensive machines. Faults occur unpredictably and can cause a breakdown that halts the process [2]. Such breakdowns can severely disrupt manufacturing, transportation, and healthcare. Hence, identifying faults in these machines is of high importance.

Preventive maintenance is a standard technique used in the industry to maintain machinery [3]. Regular services at fixed intervals keep the machine components in proper form, preventing faults and keeping the machine in operating condition. However, this method has a drawback: fixed-interval servicing incurs expense and downtime for the organization. Keeping services at an optimum level (service when needed) would reduce both downtime and expense. This is the typical application of preventive maintenance [4].

Predictive maintenance is a data-based decision-making system that identifies machinery faults early to reduce service cost and downtime [5]. The primary data come from sensors that measure vibration, temperature, stress, pressure, humidity, and so on. These sensors are attached to the machinery of interest, and their signals are analyzed to predict the occurrence of faults early. This method can be better than preventive maintenance because it replaces multiple scheduled services with services performed only when needed [6]. Several machine and deep learning algorithms are trained on large volumes of sensor measurements and are used to identify faults in the target machinery in real time [7].

Many of these sensors produce raw values that need further processing before they fit into the machine learning pipeline. Feature extraction techniques are of three types: time domain, frequency domain, and combined time-frequency domain [8]. Time domain features are statistics such as mean, median, variance, skewness, and kurtosis. Frequency domain features are obtained by applying a suitable transformation, such as the Fourier or Laplace transform, to the raw signal. Sometimes, the time and frequency domain features are combined to improve the performance of machine and deep learning algorithms [9].
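The time and frequency domain features described above can be illustrated with a minimal sketch. It uses only the Python standard library and a naive discrete Fourier transform for readability; in practice a numerical library would normally be used. The function names, the synthetic 50 Hz test signal, and the 1 kHz sampling rate are assumptions made for illustration and do not come from the reviewed studies.

```python
import cmath
import math


def time_domain_features(signal):
    """Return mean, variance, skewness, and kurtosis of a 1-D signal."""
    n = len(signal)
    mean = sum(signal) / n
    # Population central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m3 = sum((x - mean) ** 3 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest DFT magnitude, ignoring DC."""
    n = len(signal)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        # Naive O(n^2) discrete Fourier transform, adequate for a short demo.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


# Hypothetical test signal: a 50 Hz sine sampled at 1 kHz for 0.2 s.
fs = 1000
vibration = [math.sin(2 * math.pi * 50 * t / fs) for t in range(200)]
features = time_domain_features(vibration)
peak_hz = dominant_frequency(vibration, fs)
```

For this pure sine wave the variance is close to 0.5 and the spectral peak falls at 50 Hz; a real bearing signal would give a feature vector that a classifier could consume directly.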

Another approach to processing the raw sensor measurements is to convert them into time-frequency plots, called spectrograms. A spectrogram shows the frequency content of the signal at each time instant [10]. This turns the raw signal into image data, which can be given to deep learning algorithms such as convolutional neural networks to classify faults in the machinery [11]. Through these two approaches, relevant features are extracted from the raw sensor signals and provided to machine and deep learning algorithms for automated fault diagnosis. An extensive literature survey of the machine and deep learning techniques applied to automated fault detection in rolling bearing equipment follows. Figure 1 represents the workflow for automated fault diagnosis for rolling bearings using machine and deep learning techniques.
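As a rough illustration of how a spectrogram is built, the sketch below slices a signal into overlapping frames, applies a Hann window, and takes the magnitude of a naive discrete Fourier transform of each frame. The frame length, hop size, and the synthetic two-tone signal are assumptions chosen for the example; a production pipeline would typically rely on an optimized FFT routine.

```python
import cmath
import math


def spectrogram(signal, frame_len, hop):
    """Return a list of frames; each frame holds DFT magnitudes for bins 0..frame_len//2."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / frame_len)
                        for i, x in enumerate(chunk))
            mags.append(abs(coeff))
        frames.append(mags)
    return frames


# Hypothetical signal: 50 Hz for the first half, 200 Hz for the second half.
fs = 1000
sig = [math.sin(2 * math.pi * (50 if t < 500 else 200) * t / fs)
       for t in range(1000)]
spec = spectrogram(sig, frame_len=100, hop=50)
# Frequency (Hz) of the strongest bin in the first and last frames.
first_peak = spec[0].index(max(spec[0])) * fs / 100
last_peak = spec[-1].index(max(spec[-1])) * fs / 100
```

The resulting list of frames is a small time-by-frequency matrix; rendered as an image, it is the kind of input that a convolutional network would receive.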

Figure 1. General flow diagram for automated fault analysis in bearing equipment

2. Rolling Bearing

Rolling bearings minimize friction between moving parts, ensuring smooth and reliable rotation and relative movement of connected parts [12]. By facilitating the transfer of loads and motion, they contribute significantly to the overall performance and longevity of machinery across various applications [13]. They are a critical component in the machinery industry and underpin the efficient operation of many mechanical systems [14]. In tribomachinery, they support and transmit power while minimizing friction [15]. In rotating machinery, they directly affect the transmission efficiency of electrified drive powertrain systems and hence the vehicle's overall performance [16].

Figure 2. A rolling bearing with internal components

A bearing consists of inner and outer races, which provide the main load-bearing surfaces, and rolling elements, typically balls or rollers, which distribute the load evenly across the races. The cage, or retainer, maintains the correct spacing and position of the rolling elements so that they do not touch each other [17]. Seals or shields protect the internal parts of the bearing from external contaminants such as dust, dirt, and water, which would otherwise impair its operation [18]. Figure 2 shows a rolling bearing with its internal components. Although bearings are designed and constructed to be robust and durable, they are subject to different kinds of wear and damage during operation. One such failure mode is fatigue failure, caused by cyclic loading and unloading of the bearing components. Over time, this cyclic stress can lead to cracks, spalling of the inner and outer race surfaces, or loss of structural integrity [19].

Another potential cause of bearing failure is inadequate lubrication, which can increase friction and heat generation within the bearing assembly. Insufficient lubrication may lead to excessive wear and premature failure of the bearing components, particularly the inner and outer races and the rolling elements [20]. Furthermore, contamination from external sources such as dust, debris, or moisture can accelerate wear and corrosion, further exacerbating the risk of bearing failure.

It is estimated that more than 69% of equipment failures originate in bearings, which can affect the reliable operation of electric motor systems [21]. The consequences of bearing failure can be severe, ranging from the expense of unplanned maintenance to worker injuries. Unplanned maintenance due to premature bearing failure often involves an emergency shutdown of the machinery, followed by lost production time, hurried repairs that can cause further complications, and additional downtime. Emergency maintenance also tends to endanger workers; focusing on preventive maintenance can therefore reduce the risk of on-the-job injuries. Bearing failure can also damage neighboring equipment, such as shafts, bearing housings, and electric motors [22]. To limit this damage, it is important to remove the machinery from service as soon as a bearing failure is suspected and to inspect neighboring components for signs of damage. Downtime and bearing failure can lead to production loss, missed deadlines, supply contract issues, revenue loss, and costly repairs, posing significant risks to a business [23].

3. Application of Machine Learning for Bearing Fault Analysis

This section reviews the applications of machine and deep learning algorithms for automated fault diagnosis in rolling bearing equipment through vibration analysis. Figure 3 is a bar chart showing the number of papers on machine learning-based bearing fault detection published from 2016 to 2024.

Alonso-González et al. [24] applied envelope analysis to vibration signals from bearing equipment for fault detection using the Case Western Reserve University (CWRU) dataset. Time and frequency domain features were extracted, and machine learning algorithms, including KNN, SVM, naive Bayes, and decision trees, were trained on the features. Patil and Phalle [25] presented a study on ensemble machine-learning models for fault detection in anti-friction bearings. The study involves the use of random forest, gradient boosting, and extra trees classifiers and highlights their significance in feature selection and computation time. The research underlines the significance of accurate feature ranking and classifier tuning to improve diagnostics in anti-friction bearings using ensemble machine-learning methods.

Figure 3. Year-wise count of publications on bearing fault analysis using machine learning approaches

Adamsab [26] investigated classification techniques for fault detection in rotating machines. The paper focuses on some of the most widely used machine learning algorithms for fault detection and diagnosis, including the support vector machine (SVM), the artificial neural network (ANN), the decision tree (DT), and the k-nearest neighbor (k-NN). It also underlines the importance of early defect identification to reduce operating costs and mitigate failures. SVR and RVM methods for fault diagnosis are also covered; RVM is preferred because its outputs are probabilistic. The paper further discusses the feature selection and parameter optimization issues of SVM. In conclusion, the study focuses on applying machine learning to support proper maintenance and reduce machine failures in industry. Goyal et al. [27] introduced a new approach to diagnosing bearing defects that incorporates sophisticated signal processing and AI. Optical accelerometer technology captures the machine vibration signals, which are then processed with the Hilbert transform. Principal component analysis (PCA) and sequential forward floating selection (SFFS) feature selection methods filter out redundant information. The paper uses an artificial neural network (ANN) and a support vector machine (SVM) for classification. In addition to showing the potential of non-contact sensors, this research demonstrates the effectiveness of ANNs over other prevalent supervised machine learning algorithms in classifying bearing failures.

Piltan et al. [28] suggested a new approach for diagnosing rolling bearing faults using an expert system-based machine learning-based observer (ESMO). The fuzzy SMO method, combined with advanced algorithms and decision trees, improves fault detection accuracy but is inaccurate in some instances. The approach gives good classification results and is an acceptable method of identifying faults in rolling-element bearings under varying operating conditions. Another study describes the use of deep learning techniques in bearing fault diagnostics based on the CWRU dataset [29]. The paper explains the hierarchical adaptive deep convolutional neural network (ADCNN) trained with LeNet5. It highlights the need for data augmentation techniques to enhance the efficiency of deep learning in identifying defects. The study also points out the suitability of the deep learning approach for recognizing machinery faults, particularly in handling large volumes of data and setting a benchmark for fault diagnosis and prognosis. Overall, it points to the significant potential of deep learning in diagnosing machinery problems.

4. Challenges Faced by Machine and Deep Learning Methods

Traditional machine and deep learning algorithms produce excellent results on individual datasets, indicating strong performance under a single operating condition. However, they often fail to perform as well in real-world situations with multiple operating conditions. Operating conditions here refer to variation in the torque, frequency, and make of the rolling bearing. Conventional algorithms trained on one type of sensor measurement might not generalize to other sensor measurements [7]. Another issue observed with conventional machine and deep learning algorithms is their lack of adaptability to environmental variations, including changes in sensor values caused by vibration, noise from external factors, human interference, and so on. The datasets used to train these algorithms are pre-processed and cleaned, which removes such erroneous values. Hence, these models may perform poorly in real-time situations [30]. Finally, machine learning algorithms are trained on datasets with few fault modes; many more fault modes can occur in practice, and an algorithm that has never been exposed to them may produce erroneous results.

To detect bearing faults using classical ML algorithms, characteristic fault frequencies are calculated from the rotor speed and bearing geometry and serve as fault features. However, challenges such as sliding, frequency interplay, external vibration, observability, and sensitivity can affect classification accuracy [31]. The challenges in applying DL algorithms to real-world applications include: difficulty transferring knowledge from lab setups to real-world scenarios with naturally occurring faults; limited accurately labeled training data, particularly for faults that evolve naturally over time; data imbalance, since collecting sufficient faulty-condition data for effective training can be difficult; and noisy data from industrial settings, where environmental vibration and noise can degrade DL algorithms trained on vibration data [31, 32].
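The characteristic fault frequencies mentioned above follow from standard bearing kinematics. The sketch below assumes a stationary outer race, an inner race rotating at shaft speed, and no slip; the function name and the geometry values are hypothetical and chosen only for illustration.

```python
import math


def bearing_fault_frequencies(shaft_hz, n_elements, ball_d, pitch_d, contact_deg=0.0):
    """Return FTF, BPFO, BPFI, and BSF in Hz for a bearing with a stationary outer race."""
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_deg))
    return {
        "FTF": 0.5 * shaft_hz * (1 - ratio),                # cage (fundamental train)
        "BPFO": 0.5 * n_elements * shaft_hz * (1 - ratio),  # outer-race defect
        "BPFI": 0.5 * n_elements * shaft_hz * (1 + ratio),  # inner-race defect
        "BSF": 0.5 * (pitch_d / ball_d) * shaft_hz * (1 - ratio ** 2),  # rolling element
    }


# Hypothetical geometry (mm) and shaft speed, chosen for illustration only.
freqs = bearing_fault_frequencies(shaft_hz=30.0, n_elements=9,
                                  ball_d=7.94, pitch_d=39.04)
```

A useful sanity check is that the outer-race and inner-race frequencies sum to the number of rolling elements times the shaft frequency. Slip in a real bearing shifts the observed peaks slightly away from these ideal values, which is one reason the sliding effect noted above complicates classification.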

5. Reinforcement Learning

One of the promising solutions to the above-mentioned problem is reinforcement learning. These algorithms interact with the environment, learn in due process, and correct based on the reward/feedback. Hence, powerful algorithms are obtained with high adaptability and can perform well under different operating conditions [33, 34].

Reinforcement learning is a branch of machine learning in which the algorithm interacts with the environment and seeks to take actions that maximize the rewards. The algorithm, called the agent, interacts with the environment and takes actions accordingly. A reward or punishment is decided based on the action and returned to the agent. Through multiple training iterations, the agent learns continuously from its actions and experiences in the environment. In due process, the agent corrects itself and takes the actions that maximize the reward [35]. This type of learning helps the algorithm adapt to different situations in the environment, and it can be a potent solution to the lack of adaptability and generalization faced by conventional machine and deep learning algorithms. Figure 4 represents the flowchart of the taxonomy of reinforcement learning algorithms.

Figure 4. Taxonomy of reinforcement learning techniques

The techniques in reinforcement learning fall into two types: model-based and model-free. Model-based techniques use a mathematical model of the environment to determine the value of each state and the corresponding reward. The model supplies the probability of moving to each next state for every possible action, known as the state transition probability matrix [36]. Dynamic programming based on the Bellman equations falls into this category: the value of the current state is the immediate reward plus the discounted value of the next state. These techniques are useful for environments with small state spaces and full observability [37].
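A minimal sketch of the model-based idea is value iteration, which repeatedly applies the Bellman optimality update until the state values stop changing. The three-state machine-health chain, its transition probabilities, and its rewards below are invented purely for illustration.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8):
    """Solve V(s) = max_a sum_s' P(s'|s,a) [R(s,a) + gamma V(s')] by repeated sweeps."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:          # terminal state: value stays 0
                continue
            best = max(
                sum(p * (rewards[(s, a)] + gamma * values[s2])
                    for s2, p in outcomes.items())
                for a, outcomes in actions.items()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Hypothetical 3-state chain: healthy -> worn -> failed (terminal).
transitions = {
    "healthy": {"run": {"healthy": 0.8, "worn": 0.2}},
    "worn": {"run": {"worn": 0.5, "failed": 0.5},
             "repair": {"healthy": 1.0}},
    "failed": {},
}
rewards = {("healthy", "run"): 1.0, ("worn", "run"): 1.0, ("worn", "repair"): -2.0}
state_values = value_iteration(transitions, rewards)
```

In this toy model, repairing a worn machine turns out to be worth more than continuing to run it, even though the repair itself carries a negative immediate reward. The approach needs the full transition table, which is why it suits small, fully observable problems.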

Model-free techniques do not rely on a mathematical model of the environment to determine the state value and reward. Instead, a policy is generated and updated based on evaluation. One classic technique is Monte Carlo learning [38], which samples multiple complete experiences (episodes) and takes the empirical mean of their returns to estimate the state value. Another standard technique is temporal difference (TD) learning [39], which learns from incomplete experiences, updating the state value using the difference between the current estimate and the reward plus the discounted estimate for the next state. Both Monte Carlo and TD methods have on-policy and off-policy variants; for example, SARSA is an on-policy TD method, whereas Q-learning is an off-policy TD method.
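The contrast between the two model-free estimators can be sketched as follows. Monte Carlo averages the complete return observed after each visit, while TD(0) updates after every step using the current estimate of the next state. The two logged episodes, the step size, and the number of sweeps are assumptions made for the example.

```python
def monte_carlo_values(episodes, gamma=1.0):
    """Every-visit Monte Carlo: average the full return observed after each state."""
    returns = {}
    for episode in episodes:            # episode = [(state, reward), ...]
        g = 0.0
        for state, reward in reversed(episode):
            g = reward + gamma * g      # return from this step to the episode end
            returns.setdefault(state, []).append(g)
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}


def td0_values(episodes, alpha=0.1, gamma=1.0, sweeps=200):
    """TD(0): move V(s) toward r + gamma * V(s') after every single step."""
    values = {}
    for _ in range(sweeps):
        for episode in episodes:
            for i, (state, reward) in enumerate(episode):
                is_last = i + 1 == len(episode)
                next_v = 0.0 if is_last else values.get(episode[i + 1][0], 0.0)
                td_error = reward + gamma * next_v - values.get(state, 0.0)
                values[state] = values.get(state, 0.0) + alpha * td_error
    return values


# Hypothetical logged episodes: (state, reward received on leaving that state).
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("A", 0.0), ("B", 0.0)],
]
mc = monte_carlo_values(episodes)
td = td0_values(episodes)
```

Both estimators place the value of each state near 0.5 on this data. The TD estimate fluctuates slightly around that value because it uses a constant step size, but it does not need to wait for an episode to finish before learning.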

Deep reinforcement learning techniques have been gaining popularity in recent years due to the integration of reinforcement learning principles with deep neural networks. These neural networks automatically estimate the mathematical relationships between the state values and rewards [40]. Some standard techniques in deep reinforcement learning include policy gradient, actor-critic network, controller sub-controller network, deep Q learning, etc.

The policy gradient is a basic approach in deep reinforcement learning. It parameterizes the agent's policy directly and fine-tunes the parameters by following the gradient of the expected reward. Optimizing the policy directly is intended to overcome the instability and degradation that can be observed when solving the Markov decision process. Training starts from a random policy; the agent interacts with the environment, takes actions, and receives rewards. From the action probabilities and rewards, the policy gradient is estimated and the policy is updated in the same way that gradient descent updates the parameters of a neural network [41].
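The update described above can be sketched with the REINFORCE rule on a one-state problem with two actions. A softmax over two logits stands in for the policy network; the reward values, learning rate, and step count are hypothetical.

```python
import math
import random


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(arm_rewards, steps=500, lr=0.1, seed=0):
    """REINFORCE on a one-state problem: theta += lr * reward * grad log pi(action)."""
    rng = random.Random(seed)
    theta = [0.0] * len(arm_rewards)      # equal logits: a uniform random policy
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = arm_rewards[action]
        for k in range(len(theta)):
            # Gradient of log softmax: indicator(k == action) - probs[k].
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * reward * grad_log
    return softmax(theta)


# Hypothetical rewards: action 1 (say, the correct diagnosis) pays 1, action 0 pays 0.
final_policy = reinforce_bandit([0.0, 1.0])
```

Starting from a uniform policy, the probability of the rewarded action rises steadily because only rewarded actions move the logits. A deep policy-gradient method applies the same rule to the weights of a neural network.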

The deep Q-learning algorithm is the deep learning version of Q-learning, in which a deep neural network estimates the Q-value of each state-action pair. The network takes the state as input and outputs one Q-value per possible action; the agent then typically selects the action with the highest Q-value. This method is useful for environments with many states and multiple actions, where manually constructing a Q-table would be impractical. Some versions of DQN use two neural networks of the same architecture: one is updated by gradient descent, and the other, updated less frequently, provides the target predictions. Gradient descent optimizes the network so that its Q-value estimates gradually improve and the rewards are maximized in due process. However, because a neural network is a non-linear approximator of the Q function and consecutive experiences are strongly correlated, training can be unstable or converge to poor local minima. To overcome this problem, the experience replay technique was proposed. In this technique, the agent stores its experiences (state, action, reward, and next state) and replays randomly sampled ones in later training iterations [42].
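A minimal sketch of experience replay and the target computation used in deep Q-learning follows. The `ReplayBuffer` class, the `td_targets` helper, and the fixed `target_q` function standing in for a target network are all hypothetical names introduced for this example; a real implementation would use a trained neural network and a deep learning framework.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)   # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return self.rng.sample(list(self.buffer), batch_size)


def td_targets(batch, target_q, gamma=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a'), with y = r at terminal steps."""
    targets = []
    for _, _, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(target_q(next_state))
        targets.append(reward + bootstrap)
    return targets


# Stand-in for the target network: a hypothetical function returning one Q-value per action.
def target_q(state):
    return [0.1 * state, 0.5, -0.2]


buffer = ReplayBuffer(capacity=100)
buffer.push(state=0, action=1, reward=1.0, next_state=10, done=False)
buffer.push(state=10, action=0, reward=-1.0, next_state=11, done=True)
targets = td_targets(list(buffer.buffer), target_q, gamma=0.9)
```

During training, a minibatch drawn with `buffer.sample(batch_size)` would be passed to `td_targets`, and the online network would be fitted to those targets. Sampling at random breaks the correlation between consecutive transitions.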

The actor-critic network is fundamental and widely used in deep reinforcement learning research. As the name suggests, there are two neural networks: the actor network and the critic network. The actor network determines the probability of all possible actions, whereas the critic network evaluates the decisions made by the actor network. The policy gradient technique is applied to the actor network, and value function-based estimation of reward and state value is applied to the critic network [43]. Actor-critic networks can be applied to real-time problems with continuous action spaces. However, implementing actor-critic networks is computationally expensive because two networks must be trained simultaneously.
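The interplay between the two components can be sketched with a one-step actor-critic update. Tables of action preferences and state values stand in for the actor and critic networks here, which is a simplifying assumption; the state names, step sizes, and reward are hypothetical.

```python
import math


def actor_critic_step(prefs, values, state, action, reward, next_state, done,
                      alpha_actor=0.1, alpha_critic=0.1, gamma=0.9):
    """One-step actor-critic update on tabular actor preferences and critic values."""
    # Critic: the TD error measures how much better the outcome was than expected.
    next_v = 0.0 if done else values[next_state]
    td_error = reward + gamma * next_v - values[state]
    values[state] += alpha_critic * td_error

    # Actor: softmax policy over preferences, nudged along td_error * grad log pi.
    peak = max(prefs[state])
    exps = [math.exp(p - peak) for p in prefs[state]]
    probs = [e / sum(exps) for e in exps]
    for k in range(len(probs)):
        grad_log = (1.0 if k == action else 0.0) - probs[k]
        prefs[state][k] += alpha_actor * td_error * grad_log
    return td_error


# Hypothetical two-state, two-action problem with all estimates initialized to zero.
prefs = {"s0": [0.0, 0.0], "s1": [0.0, 0.0]}
values = {"s0": 0.0, "s1": 0.0}
delta = actor_critic_step(prefs, values, "s0", action=1, reward=1.0,
                          next_state="s1", done=False)
```

A positive TD error raises both the critic's value for the visited state and the actor's preference for the action that was taken; a negative TD error would do the opposite.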

The controller sub-controller network is yet another application of deep reinforcement learning. This is a set of two neural networks, where the controller is like the parent, which determines the parameters to be considered, and the sub-controllers are the children, which develop the neural network based on the parameters advised by the controller. The developed neural networks get trained on the data, and the accuracy or equivalent performance metric is estimated. The reward is calculated by comparing the metric in the current state to that of the previous state. Based on the positive/negative reward obtained, the controller network decides better neural network parameters, gradually improving performance [44].

6. Reinforcement Learning Applications for Rolling Bearing Fault Diagnosis

Around 30 papers related to applying reinforcement learning techniques on rolling bearing fault diagnosis have been identified for the proposed review. Figure 5 represents the year-wise count of papers on reinforcement learning applications in rolling bearing fault analysis.

Figure 5. Line plot representing the year-wise publication on reinforcement learning approaches for rolling bearing fault diagnosis

These papers have been categorized into three objectives: RL techniques for feature optimization, RL techniques for parameter optimization, and RL techniques for handling class imbalance. Figure 6 represents the flowchart of the different RL techniques and objectives related to rolling bearing fault diagnosis application.

Figure 6. Classification of reinforcement learning algorithms as per objective of bearing fault diagnosis

6.1 Reinforcement learning techniques for feature optimization

The objective of feature optimization is to develop reinforcement learning techniques that learn the most effective features, store them in local or online mode, and repeat the process iteratively for multiple operational modes. The agent learns and adapts to multiple operational conditions in due process, improving its accuracy and other evaluation metrics. Wang et al. [45] worked on a system that combines a deep learning network with reinforcement learning to improve bearing fault classification accuracy. A neural network with attention layers captures intricate vibration features, and a deep Q network (DQN) uses this neural network as the agent to improve the classification accuracy. The self-learning and self-improving algorithm aimed to maximize the short-term and long-term cumulative rewards. Hence, the proposed system acquired the best features from the sensory data and improved classification accuracy despite multiple variations and noisy data. Li et al. [46] developed an online domain adaptation framework using CapNet and deep reinforcement learning techniques for rolling bearing fault identification. The CapNet algorithm was used for automated feature extraction from sensory data, and deep reinforcement learning techniques improved the system's flexibility under different operational modes.

In another study [47], a reinforcement learning-based ensemble transfer learning network was used to improve the classification accuracy of rolling bearing faults. The ensemble unsupervised technique was used to develop the best reward function and to implement the ensemble model for multi-source, multi-target datasets. Wang et al. [48] proposed a novel method for multi-label fault recognition using deep reinforcement learning and curriculum learning. They implemented the proximal policy optimization (PPO) method and demonstrated the methodology in two roller-bearing experiments, where it proved more accurate than traditional methods. In another study, Kang et al. [49] proposed a dual experience model using deep reinforcement learning for fault diagnosis of roller bearings with an unbalanced dataset. The model combines deep reinforcement learning with a dual-experience pool structure to address the challenges that unbalanced data cause in fault diagnosis. It copes effectively with data imbalance and shows potential for high accuracy and efficiency in fault diagnosis of mechanical systems.

Wang et al. [50] focus on gathering multi-source information and use self-attention mechanisms and deep reinforcement learning algorithms to improve fault diagnosis in mechanical systems. They use the machine-learning MSIF feature extraction method and a sliding window attention mechanism to address differences in information between sources. The proposed architecture covers failure monitoring of bearing defects and pantographic tools, leading to more efficient equipment, shorter downtimes, and better maintenance decisions. Qian and Liu [51] developed CNN-RL and GRU-RL models to handle preprocessed vibration spectrum data. Spatial and temporal features from the CNN and GRU are further processed by fully connected layers ending in a softmax output layer for fault diagnosis. A DQN then takes this output and completes the process by assigning Q-values to potential fault actions based on the extracted features. Zheng et al. [52] proposed applying a deep reinforcement learning (DRL) method to estimate the remaining useful life (RUL) of rolling bearings. This approach combines deep learning (DL) for feature extraction with TD3 for sequential decision-making. State construction is a significant architectural part of this technique: degradation information is extracted from raw vibration signals using a ResNet-based autoencoder (AE). The state vector therefore comprises the predicted lifetime and the degradation information at each time instant. The TD3 algorithm supports continuous action spaces and remains stable in that setting.

6.2 Reinforcement learning techniques for parameter optimization

Parameter optimization, or reinforcement learning-based neural architecture search (RL-NAS), aims to identify the optimal parameters in deep learning for improved performance. Wang et al. [53] integrated reinforcement learning techniques to enhance neural architecture search (NAS) for rolling bearing fault diagnosis. Their proposed system includes a controller algorithm and child networks to update the parameters of convolutional neural networks. Similarly, Zhou et al. [54] developed an RL-NAS system with a controller, sub-controller, and child components, using a three-layer MLP as the controller for efficiency. The algorithm incorporates greedy search, experience replay, and weight-sharing for convergence and efficiency.

Wang et al. [55] used reinforcement learning with a 1D convolutional network to detect compound faults in rolling bearings. They segmented and stacked three channels (x, y, and z) for feature extraction, which were then passed to a 1D CNN. The DRL algorithm used was actor-critic, which improved the CNN's performance in classifying compound bearing faults.

Ding et al. [56] implemented deep Q-learning for intelligent fault analysis in rotating machinery. Their approach used a sparse autoencoder (SAE) for anomaly detection and framed reinforcement learning in a game format. This improved the SAE network's decision-making, resulting in enhanced classification accuracy. Cao et al. [57] present a method for designing the structure of fault diagnosis networks using deep learning. They use reinforcement learning to automate the design process and apply Pareto optimization to reduce the multi-dimensionality of the model. They suggest two approaches for the RL algorithm, proximal policy optimization (PPO) and actor-critic (AC), which appear convenient in network engineering. Finally, they chose the NAS-PERIRB algorithm for its independence and its ability to automate the creation of the neural network structure. Wen et al. [58] proposed an LSTM-DDPG framework that uses an LSTM-driven actor network to adjust CNN hyperparameters and a critic network to guide policy optimization. Experimental results demonstrate its effectiveness compared to four state-of-the-art hyperparameter optimization (HPO) methods and traditional ML/DL methods. However, the framework has a limitation: the DRL agent must be re-trained after each run. Future research could explore NAS and transfer learning for higher efficiency and better performance.

Wang et al. [59] introduced the deep dual reinforcement learning network (DDRLN), a sophisticated framework designed to solve the problem of fault quantitative identification with limited samples. The architecture consists of two main components: the actor and critic models. The actor model uses convolutional and fully connected layers for identifying unknown samples and extracting essential features. In contrast, the critic model focuses on optimizing a Q-learning function to estimate expected future rewards. An experience storage unit is integrated to address the lack of samples, allowing the model to improve prediction accuracy in fault diagnosis across different industrial settings.

6.3 Reinforcement learning for handling class imbalance

The objective here is to develop reinforcement learning techniques that overcome class imbalance. The distribution of datasets used in fault diagnosis is usually skewed. Traditional machine and deep learning algorithms work well with balanced datasets, whereas with imbalanced datasets they tend to be biased toward the majority classes at the expense of the minority classes. The deep reinforcement learning-based approach treats classification as a sequential decision problem. The agent receives the state of the environment, takes a diagnosis action guided by a policy, and receives a positive or negative reward based on the correctness of the action. Higher rewards are given for minority classes than for majority classes. The proposed approach performs better than traditional machine learning methods on imbalanced datasets [60].
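One simple way to give minority classes larger rewards is to scale the reward inversely to class frequency. The sketch below is an illustrative scheme under that assumption and is not necessarily the reward used in [60]; the function name and class counts are hypothetical.

```python
def imbalance_reward(true_label, predicted_label, class_counts):
    """Reward scaled inversely to class frequency: the rarest class earns +/-1."""
    rarest = min(class_counts.values())
    scale = rarest / class_counts[true_label]   # 1.0 for the rarest class, smaller otherwise
    return scale if predicted_label == true_label else -scale


# Hypothetical class counts for an imbalanced bearing dataset.
counts = {"normal": 900, "inner_race": 60, "outer_race": 30}
r_minority_hit = imbalance_reward("outer_race", "outer_race", counts)
r_majority_hit = imbalance_reward("normal", "normal", counts)
r_minority_miss = imbalance_reward("outer_race", "normal", counts)
```

With these counts, a correct diagnosis of the rarest fault earns thirty times the reward of a correct "normal" prediction, and missing the rare fault is penalized just as heavily, so the agent cannot score well by always predicting the majority class.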

In another study, Yang et al. [61] proposed a framework with a deep reinforcement learning model based on double deep Q networks (DDQN) integrated with transfer learning. This framework efficiently extracts fault features and accurately classifies fault types using bearing vibration signal data. The proposed networks showed a 13% improvement on imbalanced data compared to the traditional DDQN framework. Li et al. [62] proposed a deep reinforcement learning (DRL)-based diagnostic framework to detect faults in rolling bearings. It uses the DenseNet model and the advantage actor-critic (A2C) algorithm for intelligent decision-making. The model includes modifications to improve performance and adopts the synthetic minority over-sampling technique (SMOTE) to handle class imbalance.
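The core idea of SMOTE, creating synthetic minority samples by interpolating between a sample and one of its nearest minority-class neighbors, can be sketched as follows. This is a simplified illustration, not the implementation used in [62]; the function name, the neighbor count, and the two-dimensional feature vectors are assumptions.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (squared Euclidean distance).
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbor = rng.choice(others)
        gap = rng.random()          # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical 2-D feature vectors for a rare fault class.
minority_samples = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
new_samples = smote(minority_samples, n_new=6)
```

Each synthetic point lies on a line segment between two real minority samples, so the new data stay inside the region already occupied by the rare class rather than duplicating existing points.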

7. Conclusions

This paper reviews the applications of reinforcement learning (RL) in automated fault diagnosis of rolling bearings. Classic machine and deep learning algorithms are useful under specific operating conditions but generalize poorly to diverse real-world scenarios. RL addresses these challenges by allowing agents to interact with their environment, learn from experience, and adapt to different circumstances.

The study examined the use of various RL methods in fault detection, including model-free and model-based approaches, deep reinforcement learning, and generic algorithms such as deep Q-learning, actor-critic networks, and controller sub-controller networks. It illustrated the use of RL techniques in three rolling bearing fault diagnosis domains: feature selection, parameter optimization, and addressing class imbalance. The review indicates that RL can be used to determine useful features, optimize model parameters, and handle imbalanced datasets.

RL presents a promising approach to building reliable fault detection systems for rolling bearing machines. Its ability to learn from diverse data sources and adapt to new situations makes it useful for improving the accuracy and reliability of automated fault detection in real-world conditions. Additionally, combining RL with other AI techniques, such as machine learning and deep learning, holds significant potential for advancing the field. Consequently, further research should focus on creating hybrid systems that leverage RL with other methods to ensure better fault diagnosis performance and robustness. Future research directions could involve integrating RL into spectrogram analysis for bearing fault detection and multimodal feature fusion for enhanced diagnosis.

References

[1] Angelopoulos, A., Michailidis, E.T., Nomikos, N., Trakadas, P., Hatziefremidis, A., Voliotis, S., Zahariadis, T. (2019). Tackling faults in the industry 4.0 era—A survey of machine-learning solutions and key aspects. Sensors, 20(1): 109. https://doi.org/10.3390/s20010109

[2] Isermann, R. (2011). Fault-Diagnosis Applications: Model-Based Condition Monitoring: Actuators, Drives, Machinery, Plants, Sensors, and Fault-Tolerant Systems. Springer Science & Business Media.

[3] Mobley, R.K. (2002). An Introduction to Predictive Maintenance. Elsevier.

[4] Eti, M.C., Ogaji, S.O.T., Probert, S.D. (2006). Development and implementation of preventive-maintenance practices in Nigerian industries. Applied Energy, 83(10): 1163-1179. https://doi.org/10.1016/j.apenergy.2006.01.001

[5] Ran, Y., Zhou, X., Lin, P., Wen, Y., Deng, R. (2019). A survey of predictive maintenance: Systems, purposes and approaches. arXiv preprint arXiv:1912.07383. https://doi.org/10.48550/arXiv.1912.07383

[6] Pech, M., Vrchota, J., Bednář, J. (2021). Predictive maintenance and intelligent sensors in smart factory. Sensors, 21(4): 1470. https://doi.org/10.3390/s21041470

[7] Saufi, S.R., Ahmad, Z.A.B., Leong, M.S., Lim, M.H. (2019). Challenges and opportunities of deep learning models for machinery fault detection and diagnosis: A review. IEEE Access, 7: 122644-122662. https://doi.org/10.1109/ACCESS.2019.2938227

[8] Zhang, C., Mousavi, A.A., Masri, S.F., Gholipour, G., Yan, K., Li, X. (2022). Vibration feature extraction using signal processing techniques for structural health monitoring: A review. Mechanical Systems and Signal Processing, 177: 109175. https://doi.org/10.1016/j.ymssp.2022.109175

[9] Miller, C.A. (2013). Intelligent Feature Selection Techniques for Pattern Classification of Time-Domain Signals. The College of William and Mary.

[10] Boashash, B. (2015). Time-Frequency Signal Analysis and Processing: A Comprehensive Reference. Academic Press.

[11] Souza, R.M., Nascimento, E.G., Miranda, U.A., Silva, W.J., Lepikson, H.A. (2021). Deep learning for diagnosis and classification of faults in industrial rotating machinery. Computers & Industrial Engineering, 153: 107060. https://doi.org/10.1016/j.cie.2020.107060

[12] Marghitu, D., Diaconescu, C.I., Craciunoiu, N. (2001). Machine components. In Mechanical Engineer's Handbook, pp. 243-337.

[13] Rejith, R., Kesavan, D., Chakravarthy, P., Murty, S.N. (2023). Bearings for aerospace applications. Tribology International, 181: 108312. https://doi.org/10.1016/j.triboint.2023.108312

[14] Yang, H. (2005). Automatic fault diagnosis of rolling element bearings using wavelet based pursuit features. Doctoral dissertation, Queensland University of Technology.

[15] El Laithy, M., Wang, L., Harvey, T.J., Vierneusel, B., Correns, M., Blass, T. (2019). Further understanding of rolling contact fatigue in rolling element bearings-A review. Tribology International, 140: 105849. https://doi.org/10.1016/j.triboint.2019.105849

[16] Pengbo, Z., Renxiang, C., Xiangyang, X., Lixia, Y., Mengyu, R. (2023). Recent progress and prospective evaluation of fault diagnosis strategies for electrified drive powertrains: A comprehensive review. Measurement, 222: 113711. https://doi.org/10.1016/j.measurement.2023.113711

[17] Marmol, M. (2022). Development of a new bearing geometry to reduce friction losses. Technische Universität Kaiserslautern.

[18] Moerman, F., Kastelein, J. (2014). Hygienic design and maintenance of equipment. In Food Safety Management, pp. 673-739. https://doi.org/10.1016/B978-0-12-381504-0.00026-3

[19] Gegner, J. (2011). Tribological aspects of rolling bearing failures. Tribology-Lubricants and Lubrication, 33-94.

[20] Vencl, A., Gašić, V., Stojanović, B. (2017). Fault tree analysis of most common rolling bearing tribological failures. In IOP Conference Series: Materials Science and Engineering, 174(1): 012048. https://doi.org/10.1088/1757-899X/174/1/012048

[21] He, F., Xie, G., Luo, J. (2020). Electrical bearing failures in electric vehicles. Friction, 8: 4-28. https://doi.org/10.1007/s40544-019-0356-5

[22] Xu, F., Ding, N., Li, N., Liu, L., Hou, N., Xu, N., Guo, W.M., Tian, L., Xu, H., Wu, C., Wu, X., Chen, X. (2023). A review of bearing failure modes, mechanisms and causes. Engineering Failure Analysis, 152: 107518. https://doi.org/10.1016/j.engfailanal.2023.107518

[23] Pal, U., Palit, P., Gokarn, P., Kanrar, S. (2020). Failure analysis of ball bearing of conveyor: Overusage. Journal of Failure Analysis and Prevention, 20: 1992-2002. https://doi.org/10.1007/s11668-020-01014-5

[24] Alonso-González, M., Díaz, V.G., Pérez, B.L., G-Bustelo, B.C.P., Anzola, J.P. (2023). Bearing fault diagnosis with envelope analysis and machine learning approaches using CWRU dataset. IEEE Access, 11: 57796-57805. https://doi.org/10.1109/ACCESS.2023.3283466

[25] Patil, S., Phalle, V. (2018). Fault detection of anti-friction bearing using ensemble machine learning methods. International Journal of Engineering, 31(11): 1972-1981.

[26] Adamsab, K. (2021). Machine learning algorithms for rotating machinery bearing fault diagnostics. Materials Today: Proceedings, 44: 4931-4933. https://doi.org/10.1016/j.matpr.2020.12.050

[27] Goyal, D., Dhami, S.S., Pabla, B.S. (2020). Non-contact fault diagnosis of bearings in machine learning environment. IEEE Sensors Journal, 20(9): 4816-4823. https://doi.org/10.1109/JSEN.2020.2964633

[28] Piltan, F., Prosvirin, A.E., Jeong, I., Im, K., Kim, J.M. (2019). Rolling-element bearing fault diagnosis using advanced machine learning-based observer. Applied Sciences, 9(24): 5404. https://doi.org/10.3390/app9245404

[29] Neupane, D., Seok, J. (2020). Bearing fault detection and diagnosis using Case Western Reserve University dataset with deep learning approaches: A review. IEEE Access, 8: 93155-93178. https://doi.org/10.1109/ACCESS.2020.2990528

[30] Avci, O., Abdeljaber, O., Kiranyaz, S., Hussein, M., Gabbouj, M., Inman, D.J. (2021). A review of vibration-based damage detection in civil structures: From traditional methods to Machine Learning and Deep Learning applications. Mechanical Systems and Signal Processing, 147: 107077. https://doi.org/10.1016/j.ymssp.2020.107077

[31] Zhang, S., Zhang, S., Wang, B., Habetler, T.G. (2020). Deep learning algorithms for bearing fault diagnostics—A comprehensive review. IEEE Access, 8: 29857-29881. https://doi.org/10.1109/ACCESS.2020.2972859

[32] Hakim, M., Omran, A.A.B., Ahmed, A.N., Al-Waily, M., Abdellatif, A. (2023). A systematic review of rolling bearing fault diagnoses based on deep learning and transfer learning: Taxonomy, overview, application, open challenges, weaknesses and recommendations. Ain Shams Engineering Journal, 14(4): 101945. https://doi.org/10.1016/j.asej.2022.101945

[33] Rodríguez, M.L.R., Kubler, S., de Giorgio, A., Cordy, M., Robert, J., Le Traon, Y. (2022). Multi-agent deep reinforcement learning based predictive maintenance on parallel machines. Robotics and Computer-Integrated Manufacturing, 78: 102406. https://doi.org/10.1016/j.rcim.2022.102406

[34] Naeem, M., Rizvi, S.T.H., Coronato, A. (2020). A gentle introduction to reinforcement learning and its application in different fields. IEEE Access, 8: 209320-209344. https://doi.org/10.1109/ACCESS.2020.3038605

[35] Luong, N.C., Hoang, D.T., Gong, S., Niyato, D., Wang, P., Liang, Y.C., Kim, D.I. (2019). Applications of deep reinforcement learning in communications and networking: A survey. IEEE Communications Surveys & Tutorials, 21(4): 3133-3174. https://doi.org/10.1109/COMST.2019.2916583

[36] Kaiser, L., Babaeizadeh, M., Milos, P., Osinski, B., Campbell, R.H., Czechowski, K., Erhan, D., Finn, C., Kozakowski, P., Levine, S., Mohiuddin, A. (2019). Model-based reinforcement learning for Atari. arXiv preprint arXiv:1903.00374. https://doi.org/10.48550/arXiv.1903.00374

[37] Busoniu, L., Babuska, R., De Schutter, B., Ernst, D. (2017). Reinforcement Learning and Dynamic Programming Using Function Approximators. CRC Press.

[38] Bouzy, B., Chaslot, G. (2006). Monte-Carlo Go reinforcement learning experiments. In 2006 IEEE Symposium on Computational Intelligence and Games, pp. 187-194. https://doi.org/10.1109/CIG.2006.311699

[39] Menache, I., Mannor, S., Shimkin, N. (2005). Basis function adaptation in temporal difference reinforcement learning. Annals of Operations Research, 134(1): 215-238. https://doi.org/10.1007/s10479-005-5732-z

[40] Li, Y. (2017). Deep reinforcement learning: An overview. arXiv preprint arXiv:1701.07274. https://doi.org/10.48550/arXiv.1701.07274

[41] Sutton, R.S., McAllester, D., Singh, S., Mansour, Y. (1999). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems.

[42] Van Hasselt, H., Guez, A., Silver, D. (2016). Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.10295

[43] Jang, H.C., Huang, Y.C., Chiu, H.A. (2020). A study on the effectiveness of A2C and A3C reinforcement learning in parking space search in urban areas problem. In 2020 International Conference on Information and Communication Technology Convergence (ICTC), pp. 567-571. https://doi.org/10.1109/ICTC49870.2020.9289269

[44] Zhang, Z., Ma, L., Poularakis, K., Leung, K.K., Tucker, J., Swami, A. (2019). MACS: Deep reinforcement learning based SDN controller synchronization policy design. In 2019 IEEE 27th International Conference on Network Protocols (ICNP), pp. 1-11. https://doi.org/10.1109/ICNP.2019.8888034

[45] Wang, R., Jiang, H., Zhu, K., Wang, Y., Liu, C. (2022). A deep feature enhanced reinforcement learning method for rolling bearing fault diagnosis. Advanced Engineering Informatics, 54: 101750. https://doi.org/10.1016/j.aei.2022.101750

[46] Li, G., Wu, J., Deng, C., Xu, X., Shao, X. (2021). Deep reinforcement learning-based online domain adaptation method for fault diagnosis of rotating machinery. IEEE/ASME Transactions on Mechatronics, 27(5): 2796-2805. https://doi.org/10.1109/TMECH.2021.3124415

[47] Li, X., Jiang, H., Xie, M., Wang, T., Wang, R., Wu, Z. (2022). A reinforcement ensemble deep transfer learning network for rolling bearing fault diagnosis with multi-source domains. Advanced Engineering Informatics, 51: 101480. https://doi.org/10.1016/j.aei.2021.101480

[48] Wang, Z., Xuan, J., Shi, T. (2022). Multi-label fault recognition framework using deep reinforcement learning and curriculum learning mechanism. Advanced Engineering Informatics, 54: 101773. https://doi.org/10.1016/j.aei.2022.101773

[49] Kang, Y., Chen, G., Pan, W., Wei, X., Wang, H., He, Z. (2023). A dual-experience pool deep reinforcement learning method and its application in fault diagnosis of rolling bearing with unbalanced data. Journal of Mechanical Science and Technology, 37(6): 2715-2726. https://doi.org/10.1007/s12206-023-0501-y

[50] Wang, Z., Xuan, J., Shi, T. (2023). Multi-source information fusion deep self-attention reinforcement learning framework for multi-label compound fault recognition. Mechanism and Machine Theory, 179: 105090. https://doi.org/10.1016/j.mechmachtheory.2022.105090

[51] Qian, G., Liu, J. (2022). Development of deep reinforcement learning-based fault diagnosis method for rotating machinery in nuclear power plants. Progress in Nuclear Energy, 152: 104401. https://doi.org/10.1016/j.pnucene.2022.104401

[52] Zheng, G., Li, Y., Zhou, Z., Yan, R. (2024). A remaining useful life prediction method of rolling bearings based on deep reinforcement learning. IEEE Internet of Things Journal, 11(13): 22938-22949. https://doi.org/10.1109/JIOT.2024.3363610

[53] Wang, R., Jiang, H., Li, X., Liu, S. (2020). A reinforcement neural architecture search method for rolling bearing fault diagnosis. Measurement, 154: 107417. https://doi.org/10.1016/j.measurement.2019.107417

[54] Zhou, J., Zheng, L., Wang, Y., Wang, C., Gao, R.X. (2022). Automated model generation for machinery fault diagnosis based on reinforcement learning and neural architecture search. IEEE Transactions on Instrumentation and Measurement, 71: 1-12. https://doi.org/10.1109/TIM.2022.3141166

[55] Wang, Z., Xuan, J. (2021). Intelligent fault recognition framework by using deep reinforcement learning with one dimension convolution and improved actor-critic algorithm. Advanced Engineering Informatics, 49: 101315. https://doi.org/10.1016/j.aei.2021.101315

[56] Ding, Y., Ma, L., Ma, J., Suo, M., Tao, L., Cheng, Y., Lu, C. (2019). Intelligent fault diagnosis for rotating machinery using deep Q-network based health state classification: A deep reinforcement learning approach. Advanced Engineering Informatics, 42: 100977. https://doi.org/10.1016/j.aei.2019.100977

[57] Cao, J., Ma, J., Huang, D., Yu, P. (2022). Finding the optimal multilayer network structure through reinforcement learning in fault diagnosis. Measurement, 188: 110377. https://doi.org/10.1016/j.measurement.2021.110377

[58] Wen, L., Wang, Y., Li, X. (2022). A new automatic convolutional neural network based on deep reinforcement learning for fault diagnosis. Frontiers of Mechanical Engineering, 17(2): 17. https://doi.org/10.1007/s11465-022-0673-7

[59] Wang, R., Jiang, H., Li, X., Yao, P. (2021). A network structure search method based on reinforcement learning for rolling bearing fault diagnosis. In 2021 Global Reliability and Prognostics and Health Management (PHM-Nanjing), pp. 1-5. https://doi.org/10.1109/PHM-Nanjing52125.2021.9612909

[60] Zhong, X., Zhang, L., Ban, H. (2023). Deep reinforcement learning for class imbalance fault diagnosis of equipment in nuclear power plants. Annals of Nuclear Energy, 184: 109685. https://doi.org/10.1016/j.anucene.2023.109685

[61] Yang, D., Karimi, H.R., Pawelczyk, M. (2023). A new intelligent fault diagnosis framework for rotating machinery based on deep transfer reinforcement learning. Control Engineering Practice, 134: 105475. https://doi.org/10.1016/j.conengprac.2023.105475

[62] Li, Y., Wang, Y., Zhao, X., Chen, Z. (2024). A deep reinforcement learning-based intelligent fault diagnosis framework for rolling bearings under imbalanced datasets. Control Engineering Practice, 145: 105845. https://doi.org/10.1016/j.conengprac.2024.105845