The study of fault identification of vibration signals from rotating machinery is essential for enhancing industrial production safety. Because the original vibration signal of rotating machinery is contaminated by multiple noise sources, a method combining a capsule network and a frequency-slicing wavelet transform is proposed to improve the fault identification accuracy. Considering the variable operating conditions of rotating machinery, the capsule network learning model was also optimized using a dynamic weighting method based on the channel attention mechanism. The dynamic weighting algorithm based on the channel attention mechanism achieved the highest fault recognition rates among the compared algorithms, with 99.65%, 99.25%, and 99.90% on sensor 1, sensor 2, and feature fusion data, respectively. Hence, the proposed model outperformed the CNN and ResNet baselines in identifying faults from rotating machinery vibration signals.


## 1. Introduction

In recent decades, China’s scientific and technological development has increased
rapidly, and machinery manufacturing and other fields are gradually becoming more
intelligent and refined ^{[1]}. Rotating machinery is a vital class of mechanical equipment and is commonly used
in wind power, the chemical industry, and other fields ^{[2]}. However, rotating machines are subject to long periods of standby operation,
which can cause damage to their core components, such as rolling bearings and rotors,
to varying degrees ^{[3]}. Rotating machinery is also exposed to complex and changing operating environments
and conditions. Frequent maintenance or replacement of critical mechanical components
will increase the consumption of raw materials significantly. This increases industrial
costs and causes environmental pollution, which runs counter to the concept of sustainable
development advocated today. Sustainable development requires the coordinated operation
of the economy, society, and nature. An important premise follows: the intensity
of industrial development should stay within the range that the ecosystem can sustain,
and industrial and economic growth should not be pursued blindly at the cost of irreversible
environmental damage. On this basis, industrial development and environmental protection
can benefit each other.
For rotating machinery, only by identifying damage and other faults
in a timely and accurate manner can a series of additional losses be avoided, reducing the consumption
of materials and the emission of polluting gases or waste ^{[4]}. The main techniques reported for fault diagnosis and signal identification in rotating
machinery are the analysis of vibration signals or the study of traditional data-driven
techniques. These methods aim to separate or strengthen the fault signals by processing,
analyze multiple signals, and classify the faults. However, these methods
are ineffective when the noise and disturbances in the vibration signals of rotating
machinery or their non-stationary nature are considered. Although an increasing
number of methods rely on deep learning, the existing deep learning approaches are largely limited
to shallow models and are not effective enough for fault diagnosis and identification.
Therefore, the current study combines time-frequency analysis with deep learning: a
frequency-slicing wavelet transform based on the energy occupation factor is constructed
and combined with an improved capsule network to improve the fault identification accuracy.
The innovation of this research lies in using a frequency-slicing wavelet transform based
on the energy ratio to realize signal time-frequency analysis. This avoids the subjectivity of frequency
band selection in an ordinary frequency-slicing wavelet transform and also
enriches the fault signal features that can be acquired. In addition, a capsule
network, whose dynamic routing replaces the pooling layer of the CNN, was used to reduce the signal
loss problem. Finally, a dynamic weighting method based on the channel attention mechanism
was adopted to characterize the complexity of fault signals more accurately, improve
the adaptability of the model to off-design conditions, and achieve multi-sensor data
fusion.

## 2. Related Work

Qiao et al. designed a prediction model combining wavelet transform and long short-term
memory (LSTM) neural networks to improve the robustness of power prediction. The combined
model obtained better prediction results for power prediction of multiple generation
methods than traditional methods ^{[5]}. Grobbelaar et al. proposed a wavelet transform-based denoising technique for noise
removal from EEG signals. The method obtained better noise reduction in the performance
experiments of the reference channel ^{[6]}. Zolfaghari et al. designed a wavelet transform method with adaptive performance
and combined it with the random forest intelligence algorithm to obtain a new hybrid
prediction model and achieve the prediction of hydroelectric power production. Khare
et al. aimed to solve the problem of information loss and classification errors in
EEG signal decomposition and designed a wavelet transform technique with automatic
decomposition. The method was more than 95% accurate in predicting the EEG signals
of Parkinson’s patients ^{[8]}. Kamgar et al. examined the response of soil structures during earthquakes using
wavelet transform methods. Two types of soil were considered for the experimental
setup. The parameter prediction error of the method was less than 5% for both low-level
earthquakes ^{[9]}. Escola et al. concluded that the discrete wavelet transform has advantages in the
time-frequency analysis of speech signals and combined it with IoT technology to obtain
a new method for analyzing audio signals. The results showed that the fusion method
could analyze audio signals from 256 Hz to 6.5 kHz ^{[10]}. Omidvar et al. aimed to analyze the seizure characteristics of epileptic patients
using EEG signals. They extracted the EEG signal features with the help of the discrete
wavelet transform method and combined it with a support vector machine to achieve
signal classification. Chen et al. proposed a structural damage monitoring model combining
a continuous wavelet transform and deep convolutional networks. The experimental results
showed that the method has high accuracy and robustness in damage detection compared
to other machine-learning methods ^{[12]}.

Many scholars have also examined fault signal recognition and abnormal point monitoring.
Gu et al. designed a fusion model incorporating a filtering algorithm, Hilbert transforms,
and a support vector machine to identify the fault signal components of spindle devices.
The model allowed better feature extraction in identifying different types of fault
signals than previous models ^{[13]}. Yao et al. proposed a wheel-to-axle box concurrent fault identification model based
on an efficient neural network and attention mechanism to improve the extraction of
features of various dimensions. The algorithm could effectively diagnose wheel-to-axle
box concurrent faults ^{[14]}. Wang et al. proposed an automatic rolling bearing fault identification method based
on fine composite multiscale discrete entropy to achieve rolling bearing vibration
data input without pre-processing and fault identification output without relying
on external expert empirical knowledge. Their model showed good fault identification
capability, good generalization performance, and good prospects for industrial applications
^{[15]}. Abu-Rub et al. proposed a machine learning-based diagnostic scheme for identifying
and characterizing partial discharge signals formed by different internal sources in
solid insulation. The accuracy of the proposed machine learning-based diagnosis for
identifying and characterizing the defect size ranged from 92% to 96% ^{[16]}. Mukherjee et al. developed several improved power system protection
algorithms that can eliminate faults as soon as they occur. They also reviewed the methods
used by many researchers to develop effective fault diagnosis schemes
^{[17]}. Zheng developed a simulation model of a short-circuit fault in a low-voltage AC
microgrid to detect and locate microgrid short-circuit faults. The simulation and
experimental results showed that the method could detect and locate microgrid short-circuit
fault areas quickly and accurately ^{[18]}. Hou et al. proposed a distribution network fault identification method by combining
improved complete ensemble empirical mode decomposition with adaptive noise and
conditional generative adversarial networks to address the fault sample imbalance
problem. Experiments showed that the method can effectively learn the distribution characteristics
of the original samples and improve fault identification
accuracy ^{[19]}. Mbagaya et al. proposed an automatic parameter identification method based on a
particle swarm optimization algorithm for identifying the dynamic parameters of rolling
bearings. The mean accuracy of bearing fault identification was 99.67% and 99.2% for the Case Western
Reserve University and Paderborn University datasets, respectively ^{[20]}. Pu et al. combined spatial clustering and a support vector machine in an unsupervised
algorithm to monitor network outliers. The method showed excellent performance on
the NSL-KDD data set ^{[21]}. Vanem et al. proposed a clustering-based algorithm for monitoring anomalous sensor data
to understand the engine operating status in real time. The method had advantages
in condition monitoring ^{[22]}. To process large-scale telemetry data, Putina et al. designed an unsupervised
clustering algorithm for anomaly detection. This method was faster and more accurate
^{[23]}.

As shown above, studies on the wavelet transform have focused mainly on power prediction and noise reduction of EEG signals. They have mostly combined the wavelet transform with long short-term memory neural networks or random forest algorithms but seldom with frequency slicing. In fault signal identification, research has been directed mainly toward filtering algorithms or support vector machine models, and the problem of fault signal identification in rotating machinery was rarely considered. Therefore, this study examined the problem of fault recognition of vibration signals from rotating machinery using a frequency-slicing wavelet transform and a deep learning model containing a capsule network.

## 3. Rotating Machinery Vibration Fault Signal Identification Model based on Frequency-slicing Wavelet Transform and Improved Capsule Network

### 3.1 Frequency-slicing Wavelet Transform-based Rotating Machinery Vibration Fault Signal Identification Model

Because the vibration signal of rotating machinery contains considerable noise interference, feeding the original signal directly into the recognition model without processing reduces its diagnostic performance for fault signals. A combination of the frequency-slicing wavelet transform and a capsule network is therefore used to improve the accuracy of fault signal recognition. The frequency-slicing wavelet transform offers the arbitrary time-frequency resolution of the continuous wavelet transform. At the same time, it does not require wavelet basis functions to perform the inverse transform operation; instead, it is calculated using a Fourier transform. Let the original vibration signal of the rotating machine be $f(t)\in L^{2}(R)$ and let its Fourier transform be $\hat{f}(w)$, where $R$ is the set of real numbers and $L^{2}(R)$ is the space of square-integrable (finite-energy) functions. The frequency-slicing wavelet transform of this signal can then be expressed using Eq. (1).

##### (1)

$ W\left(t,w,\lambda ,\sigma \right)=\frac{1}{2\pi }\lambda \int _{-\infty }^{+\infty }\hat{f}\left(u\right)\hat{p}^{\ast }\left(\frac{u-w}{\sigma }\right)e^{iut}du $

In Eq. (1), $\sigma $ and $\lambda $ are the scale factor and energy factor, respectively, and $u$ is the estimated frequency. $t$ and $w$ represent the signal monitoring time and monitoring frequency, respectively. $\hat{p}$ and $\hat{p}^{\ast }$ are the frequency slice function and its conjugate, respectively; $e$ is the base of the natural logarithm and $i$ is the imaginary unit. Eq. (2) is the defining equation for frequency resolution.

where $\Delta w_{p}$ is the width of the frequency window corresponding to the frequency slice function and $\eta _{p}$ is the frequency resolution. A time-frequency resolution factor $K$, denoted as $K=w/\sigma $, is also needed to achieve a controlled sensitivity to the time or frequency dimension of the signal. Therefore, Eq. (2) can be transformed into Eq. (3).

The time resolution coefficient tends to vary inversely with the frequency resolution, achieving a multi-resolution analysis effect. In general, $\eta _{p}\ll 1$. The frequency-slicing function can be regarded as a band-pass filter that extends the time-frequency domain. Eq. (4) shows two common frequency-slicing functions.

##### (4)

$ \left\{\begin{array}{l} \hat{p}\left(w\right)=e^{-0.5{w^{2}}}\\ \hat{p}\left(w\right)=\frac{1}{1+w^{2}} \end{array}\right. $

The time-to-bandwidth ratio is a common performance indicator for frequency-slicing functions. A smaller time-to-bandwidth ratio means the function shows better signal aggregation performance in the time-frequency domain. Both functions presented in Eq. (4) have a time-to-bandwidth ratio of 0.5. Theoretically, the frequency-slicing wavelet transform can therefore localize the signal well in the time-frequency domain. In practice, however, the noise interference inherent in mechanical vibration signals causes errors in the selection of monitoring frequencies for the unoptimized frequency-slicing wavelet transform, which affects the recognition results. To select the monitoring frequencies optimally, an energy ratio factor was introduced, which yields the energy-ratio-based FSWT (ERB-FSWT). Fig. 1 shows a flow diagram of the ERB-FSWT.
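To make Eq. (1) concrete, the following is a minimal pure-Python sketch of a single frequency slice. It assumes the Gaussian slice function of Eq. (4), $\lambda =1$, $\sigma =w/K$, and a naive discrete Fourier transform in place of the continuous integral. The function names (`dft`, `idft`, `fswt_slice`, `mean_magnitude`), the test signal, and the values of `fs` and `K` are illustrative assumptions, not the implementation used in the study.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def idft(spectrum):
    """Naive O(N^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
        for k in range(n)
    ]


def gaussian_slice(v):
    """First slice function of Eq. (4). It is real, so it equals its conjugate."""
    return math.exp(-0.5 * v * v)


def fswt_slice(signal, fs, w, K, lam=1.0):
    """Approximate W(t, w, lam, sigma) of Eq. (1) at one monitoring frequency w (Hz).

    Assumes sigma = w / K. Returns one complex coefficient per time sample.
    """
    n = len(signal)
    sigma = w / K
    spectrum = dft(signal)
    weighted = []
    for m in range(n):
        # Signed frequency (Hz) of DFT bin m.
        u = (m if m <= n // 2 else m - n) * fs / n
        weighted.append(lam * spectrum[m] * gaussian_slice((u - w) / sigma))
    return idft(weighted)


def mean_magnitude(coefficients):
    """Average modulus of the slice coefficients over time."""
    return sum(abs(c) for c in coefficients) / len(coefficients)


# Hypothetical test signal: a 10 Hz component plus a weaker 40 Hz component.
fs = 128.0
n = 128
signal = [
    math.sin(2 * math.pi * 10 * k / fs) + 0.5 * math.sin(2 * math.pi * 40 * k / fs)
    for k in range(n)
]

K = 10.0
for w in (10.0, 25.0, 40.0):
    value = mean_magnitude(fswt_slice(signal, fs, w, K))
    print(f"w = {w:5.1f} Hz, mean |W| = {value:.4f}")
```

For this signal, the slices at 10 Hz and 40 Hz should give mean magnitudes close to 0.5 and 0.25 (half of each sinusoid amplitude, because only the positive-frequency component passes the slice), while the slice at 25 Hz is close to zero. A practical implementation would use a fast Fourier transform rather than the naive transform shown here.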

As shown in Fig. 1, the energy occupation factor $\varepsilon $ is set to a small constant, and the energy value and cut-off frequency are calculated using Eq. (5). Once the monitoring range is set to $[0,\Delta F]$, the frequency-slicing wavelet transform subdivides the signal frequencies and calculates the time-frequency decomposition coefficients to obtain the time-frequency distribution of the signal in that range. Eq. (5) is a mathematical expression for the energy occupation factor.

##### (5)

$ \varepsilon =\frac{E_{\Delta F}}{E_{sum}} $

where $\Delta F$ is the cut-off frequency; $E_{\Delta F}$ is the energy value from the starting frequency to the cut-off frequency; $E_{sum}$ is the sum of the energy values in the monitored frequency range. The initial monitoring frequency was set to zero because rotating machinery often vibrates in the low-frequency band rather than the high-frequency band.

After the frequency-slicing wavelet transform optimized by the energy occupation factor (Fig. 1), a fault classifier is used for fault identification in rotating machinery vibration signals. A fault classifier is a deep learning model, usually implemented with a convolutional neural network (CNN). However, CNNs rely on individual scalar neurons when extracting data features, which limits their classification and recognition performance. In addition, CNNs are weak at learning affine transformations during convolutional operations, leading to the loss of some signal features. Capsule networks are attracting attention because they group single neurons into combined neurons, which preserves the relationships between the data being processed. In contrast to CNNs, capsule networks use so-called "capsule vectors" as the output values of the model, which allows a more diverse range of features to be extracted from the signal data. The network also uses a "dynamic routing" algorithm to form a capsule layer instead of the traditional pooling layer used in CNNs, which reduces the loss of signal data. Fig. 2 presents the structure of the capsule network.

The initial capsule layer forms the capsule activity vectors, as shown in Fig. 2. The digital capsule layer transforms the length of each capsule vector into a probability using a transformation matrix. The final classification capsule layer outputs the classification and recognition result for the signal. The capsule network retains the same basic convolutional layers as the CNN. Eq. (6) is a mathematical expression for the convolution operation.

##### (6)

$ x_{k'}^{\left(l\right)}=\phi \left(\sum _{k=1}^{K}w_{k}^{\left(l\right)}\ast x_{k}^{\left(l-1\right)}+b_{k}^{l}\right) $

where $x_{k'}^{\left(l\right)}$ is the $k'$-th output feature of network layer $l$; $k$ is the input feature index and $K$ is the number of input features; $\ast $ is the discrete convolution between the convolution kernel of layer $l$ and the $k$-th feature of layer $l-1$; $w_{k}^{\left(l\right)}$ is the convolution kernel; $b_{k}^{l}$ is the bias of the network; $\phi $ is the activation function acting on the convolutional output. Common activation functions are the hyperbolic tangent function, the sigmoid function, and the Rectified Linear Unit (ReLU). Because the first two activation functions suffer from vanishing gradients when they saturate, the ReLU function is used as the activation function of the model. Eq. (7) is a mathematical expression of the rectified linear unit. Eq. (8) shows the expression of the output after activation using this function.

##### (7)

$ \text{ReLU}(x)=\left\{\begin{array}{ll} x, & x>0\\ 0, & x\leq 0 \end{array}\right. $

Eq. (7) shows that the rectified linear unit is a piecewise linear function. In Eq. (8), $x_{ijk}^{\left(l\right)}$ indicates the $(i,j)$ output element of the $k$-th feature in network layer $l$. The dynamic routing algorithm in the capsule network consists of the following three processes. Process 1 multiplies the neurons by their corresponding weights to obtain the output prediction vector. Eq. (9) is the mathematical expression for process 1.

##### (9)

$ u_{j\left| i\right.}=W_{ij}u_{i} $

where $u_{i}$ is the output of neuron $i$ in the previous layer; $W_{ij}$ is the matrix of neuron weights; $u_{j\left| i\right.}$ is the output prediction vector. Process 2 obtains the total output vector by weighting and summing the output prediction vectors obtained from process 1. Eq. (10) is the mathematical expression for process 2.

##### (10)

$ S_{j}=\sum _{i}C_{ij}u_{j\left| i\right.} $

where $C_{ij}$ is the coupling coefficient, and $S_{j}$ is the total output vector. Process 3 is a nonlinear compressive mapping of this total output vector, as expressed in Eq. (11).

##### (11)

$ v_{j}=\frac{\left\| S_{j}\right\| ^{2}}{1+\left\| S_{j}\right\| ^{2}}\frac{S_{j}}{\left\| S_{j}\right\| } $

where $j$ is the index of the output neuron and $v_{j}$ is its output vector. In this dynamic routing algorithm, the coupling coefficients are obtained iteratively from the prediction and output vectors rather than by backpropagation, which helps avoid the problem of gradient disappearance during network training.
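The three routing processes can be sketched in pure Python as follows. The sketch assumes the standard routing-by-agreement update, in which the coupling coefficients $C_{ij}$ are a softmax over routing logits that grow with the dot product between each prediction vector and the output vector; this update rule is not spelled out in the text above and is an assumption here. The compression step divides by the norm of $S_{j}$ so that the output length stays below 1. All function names, the toy capsules, and the weight matrices are hypothetical.

```python
import math


def squash(s):
    """Process 3: compress s to a length in [0, 1) while keeping its direction."""
    norm_sq = sum(c * c for c in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * c for c in s]


def matvec(matrix, vector):
    """Matrix-vector product on nested lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def dynamic_routing(u, W, iterations=3):
    """u[i] is input capsule i; W[i][j] is the weight matrix from capsule i to j.

    Returns the list of output capsule vectors v_j.
    """
    n_in, n_out = len(u), len(W[0])
    # Process 1 (Eq. (9)): prediction vectors u_{j|i} = W_ij u_i.
    u_hat = [[matvec(W[i][j], u[i]) for j in range(n_out)] for i in range(n_in)]
    logits = [[0.0] * n_out for _ in range(n_in)]
    v = []
    for _ in range(iterations):
        # Coupling coefficients C_ij (assumed softmax over output capsules).
        c = [softmax(row) for row in logits]
        v = []
        for j in range(n_out):
            dim = len(u_hat[0][j])
            # Process 2 (Eq. (10)): S_j = sum_i C_ij u_{j|i}.
            s_j = [
                sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                for d in range(dim)
            ]
            v.append(squash(s_j))
        # Agreement update (assumed): raise logits where prediction and output align.
        for i in range(n_in):
            for j in range(n_out):
                logits[i][j] += sum(a * b for a, b in zip(u_hat[i][j], v[j]))
    return v


# Hypothetical toy example: three 2-D input capsules routed to two output capsules.
identity = [[1.0, 0.0], [0.0, 1.0]]
rotation = [[0.0, -1.0], [1.0, 0.0]]
u = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[identity, rotation] for _ in u]
outputs = dynamic_routing(u, W)
for j, vec in enumerate(outputs):
    print(f"capsule {j}: length = {math.hypot(*vec):.4f}")
```

The length of each output vector lies between 0 and 1, which is what allows the capsule length to be read as a probability in the later layers.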

### 3.2 Dynamically Weighted Improved Fault Identification Model for Capsule Networks based on the Channel Attention Mechanism

Deep learning fault recognition models, including capsule networks, assume that the training and test samples follow the same distribution. In practice, however, the operating conditions of rotating machinery are highly complex. The recognition accuracy and generalization capability of the network are adversely affected if an unimproved deep learning model is used without considering the variability of the actual operating conditions. At the same time, multiple sensors are used to collect the vibration signals when monitoring rotating machinery. To fuse the signals collected by multiple sensors, the study proposes a dynamic weighting method based on the channel attention mechanism to improve the adaptability of the model to complex signals. Table 1 lists the fusion levels and their characteristics for signals collected by multiple sensors.

Table 1 shows that signal fusion is divided into three levels: low, medium, and high. The input form of the signal fusion is divided into data, feature, and decision types. Suppose $m$ sensors are used in the signal acquisition of rotating machinery and the vibration signal of sensor $i$ is $x_{i}$. The ERB-FSWT transform then gives a time-frequency image $y_{i}$ for each sensor. These time-frequency images are stitched together along the channel dimension to obtain a feature map $Y$. Eq. (12) expresses this feature map.

where $l$ and $w$ indicate the length and width of the feature map, and $C$ denotes the channel splicing operation. After the channel splicing process is complete, a channel scaling operation is also required. Eq. (13) is the mathematical expression for this operation.

##### (13)

$ S^{k}=GAP\left(Y^{k}\right)=avg\left\{Y^{l\times w\times k}\left| \forall Y^{l\times w\times k}\in Y^{k}\right.\right\},\left(1\leq k\leq 3m\right) $

where $S^{k}$ is the output of the channel scaling operation for channel $k$; $GAP(\cdot )$ denotes global average pooling; $k$ is the channel index, which runs over the $3m$ channels of the spliced feature map. Eq. (14) expresses the channel decay and excitation process.

##### (14)

$ \left\{\begin{array}{l} F_{1}^{j}=FC\left(S\right)=W_{1}^{ij}S^{i}+b_{1}^{i}\\ F_{2}^{j}=FC\left(F_{1}\right)=W_{2}^{ij}F_{1}^{i}+b_{2}^{i} \end{array}\right. $

where $F_{1}^{j}$ and $F_{2}^{j}$ are the outputs of fully connected layer 1 and fully connected layer 2, respectively; $W_{1}^{ij}$ and $W_{2}^{ij}$ are the weight matrices; $b_{1}^{i}$ and $b_{2}^{i}$ are the bias matrices. The channel decay and excitation process models the correlation among all channels obtained by scaling, using two fully connected layers in series. $F^{j}$ denotes the $j$-th output neuron of a fully connected layer. Eq. (15) expresses the weight value generation.

##### (15)

$ w_{k}=Sigmoid\left(F_{2}^{k}\right) $

where $w_{k}$ is the normalized weight value of channel $k$. Sigmoid denotes the activation function, which maps the output features of fully connected layer 2 into the interval (0,1), resulting in a normalized weight value for every channel. Eq. (16) expresses the final dynamic weighting process.

##### (16)

$ \hat{Y}^{k}=scale\left(Y^{k},w_{k}\right) $

where $\hat{Y}$ denotes the feature map obtained using dynamic weighting, and $scale$ is defined as the channel weighting operation.
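The dynamic weighting chain (global average pooling, two fully connected layers, sigmoid, channel-wise weighting) can be sketched in pure Python as below. The sketch follows Eq. (14) as written, with two linear layers and no activation between them, and it assumes that the channel weighting operation multiplies every element of a channel by that channel's weight. The function names, the toy feature map, and all weight values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def global_average_pool(feature_map):
    """Eq. (13): reduce each l x w channel to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def fully_connected(weights, biases, inputs):
    """One linear layer of Eq. (14): weights @ inputs + biases."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def sigmoid(z):
    """Map a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(feature_map, W1, b1, W2, b2):
    """Return the per-channel weights and the re-weighted feature map."""
    S = global_average_pool(feature_map)
    F1 = fully_connected(W1, b1, S)
    F2 = fully_connected(W2, b2, F1)
    channel_weights = [sigmoid(z) for z in F2]
    # Assumed meaning of "scale": multiply each channel by its own weight.
    weighted = [
        [[w * value for value in row] for row in channel]
        for channel, w in zip(feature_map, channel_weights)
    ]
    return channel_weights, weighted


# Hypothetical feature map with three 2 x 2 channels.
Y = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[0.0, 4.0], [4.0, 0.0]],
]
# Hypothetical layer parameters: 3 channels -> 2 hidden units -> 3 channels.
W1 = [[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
b2 = [0.0, 0.0, 0.0]

weights, Y_hat = channel_attention(Y, W1, b1, W2, b2)
print("channel weights:", [round(w, 4) for w in weights])
```

In a trained model, the two layers would be learned jointly with the capsule network, so that channels from the more informative sensor receive weights closer to 1.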

Fig. 3 presents a flow diagram of the method for fault identification. The identification consists of the following steps. First, the rotating machine vibration data received from the multiple sensors are pre-processed. The data are then transformed into time-frequency images using the ERB-FSWT transform. Subsequently, the time-frequency images are divided into training and test samples. The training samples are weighted and fused along the channel layer, i.e., by the dynamic weighting operation, and used to train the model. The test samples are then fed into the trained model for fault signal identification, and the result is used for analysis.

##### Table 1. Signal fusion levels and their characteristics.

## 4. Utility Analysis of a Rotating Machinery Fault Identification Model based on Improved Wavelet Transform and Improved Capsule Network

### 4.1 Analysis of the Results of the Rotating Machinery Vibration Fault Signal Recognition Model based on Frequency-slicing Wavelet Transform

The test rig used for the study contains a motor, a torque measuring unit, a bearing test module, a flywheel, and a load motor. In the fault experiments on rotating machinery, the bearing faults were repeated, and the vibration signals were collected using accelerometers. The sampling time was 5 s, and the sampling frequency was 64 kHz. Four operating conditions and two different datasets were used; the values are listed in Table 2.

After the ERB-FSWT transformation, three time-frequency image sizes were considered for the original vibration signal: 32${\times}$32${\times}$3, 64${\times}$64${\times}$3, and 128${\times}$128${\times}$3. Fig. 4 shows the fault signal identification accuracy for the three time-frequency image sizes.

On the D1 test data set, the identification accuracy was 96.1%, 99.0%, and 95.3% for the three time-frequency image sizes (32${\times}$32${\times}$3, 64${\times}$64${\times}$3, and 128${\times}$128${\times}$3), respectively. On the D2 test data set, it was 96.9%, 99.1%, and 96.3%, respectively. Both datasets achieved the best recognition results at the time-frequency image size of 64${\times}$64${\times}$3. Therefore, the model with the ERB-FSWT transform is best suited to a time-frequency image size of 64${\times}$64${\times}$3. Fig. 5 presents the results of the cut-off frequency and the fault signal identification for different energy occupation factors.

The cut-off frequencies of the three fault types increase as the energy occupation factor increases, indicating that the frequency range of signal monitoring expands (Fig. 5(a)). As shown in Fig. 5(b), the energy occupation factor correlates positively with both the test accuracy and the time-frequency feature extraction time. When the energy occupation factor was 1.0, the test accuracy was 100% and the corresponding feature extraction time was 99.7 s.
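The growth of the cut-off frequency with the energy occupation factor follows from the definition of the factor as a cumulative energy ratio. The pure-Python sketch below assumes that $\Delta F$ is taken as the lowest frequency at which the spectral energy accumulated from 0 Hz reaches the fraction $\varepsilon $ of the total energy, and that the energy is measured by the squared magnitude of a naive one-sided discrete Fourier transform. The function names and the two-tone test signal are hypothetical and unrelated to the experimental data.

```python
import cmath
import math


def power_spectrum(x):
    """One-sided power spectrum (bins 0..N/2) from a naive O(N^2) DFT."""
    n = len(x)
    spectrum = []
    for m in range(n // 2 + 1):
        coefficient = sum(
            x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)
        )
        spectrum.append(abs(coefficient) ** 2)
    return spectrum


def cutoff_frequency(x, fs, eps):
    """Lowest frequency (Hz) at which E_dF / E_sum reaches eps, starting at 0 Hz."""
    power = power_spectrum(x)
    total = sum(power)
    running = 0.0
    for m, p in enumerate(power):
        running += p
        if running >= eps * total:
            return m * fs / len(x)
    # Fallback for rounding effects when eps is very close to 1.
    return (len(power) - 1) * fs / len(x)


# Hypothetical signal: a strong 10 Hz component and a weaker 40 Hz component.
fs = 128.0
n = 128
signal = [
    math.sin(2 * math.pi * 10 * k / fs) + 0.5 * math.sin(2 * math.pi * 40 * k / fs)
    for k in range(n)
]

for eps in (0.5, 0.9, 0.99):
    print(f"eps = {eps:.2f} -> cut-off = {cutoff_frequency(signal, fs, eps):.1f} Hz")
```

Because the accumulated energy never decreases with frequency, a larger $\varepsilon $ can only keep or raise the cut-off, which widens the band that the transform must decompose and is consistent with the longer feature extraction time reported above.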

### 4.2 Analysis of the Results of a Dynamically Weighted Improved Capsule Network Fault Identification Model based on the Channel Attention Mechanism

Considering the variable nature of rotating machinery operating conditions, this study proposes a dynamic weighting method based on the channel attention mechanism, aiming at effectively fusing the feature layers of the signal. A mechanical fault simulator was used to obtain the signal sample data. It consists of critical components, such as AC motors, acceleration sensors, couplings, rolling bearings, data acquisition boxes, and frequency converters. Two sensors were used, one positioned close to the motor and the other away from it. The motor speed was 2000 rpm, and the sampling frequency was 4 kHz. Fig. 6 shows the diagnosis results of the three fault identification algorithms.

According to Fig. 6, for the sensor 1 data, the CNN, ResNet, and the dynamic weighting algorithm based on the channel attention mechanism achieved fault identification accuracies of 99.00%, 96.25%, and 99.65%, respectively. For the sensor 2 data, the three algorithms achieved 97.40%, 95.35%, and 99.25% accuracy, respectively. For the feature layer fusion data, they achieved 99.60%, 98.10%, and 99.90% accuracy, respectively. Hence, the dynamically weighted algorithm based on the channel attention mechanism achieved the highest fault identification rate for all three data sources. This result highlights the superiority of the method in fault identification. Table 3 lists the fault identification results for the five fusion levels.

Table 3 shows that the fault recognition rate is highest when fusion is performed at the feature layer or the basic convolutional layer, with an accuracy of 100% in both cases. The multiscale module is the next best, at 99.90%. For the initial capsule layer and the digital capsule layer, the fault recognition accuracy is 99.85% and 99.70%, respectively. The average training time was also the lowest at the feature layer fusion level, at 256.44 s. Hence, the model performs best with the feature layer fusion strategy. Because the mechanical fault simulation is still not sufficiently complex, this study also conducted a simulation experiment on a reduction gearbox to increase the complexity of the rotating machine conditions. This experiment was set up to determine whether the model still has a high fault recognition rate when the speed varies. Table 4 lists the fault identification results of both algorithms for the speed variation case.

The fault identification accuracies obtained by the proposed algorithm were all higher than those of the CNN algorithm when the rotational speed was varied (Table 4). For example, at a speed of 1400 rpm, the CNN has a fault identification rate of 37.08% for multi-sensor data, compared to 41.67% for the proposed model, an improvement of 4.59 percentage points.

##### Fig. 5. Cut-off frequency and fault signal identification results under different energy proportion coefficients.

##### Table 2. Parameter values under different working conditions.

##### Table 3. Fault identification results at five fusion levels.

##### Table 4. Fault identification results of two algorithms in case of speed change.

## 5. Conclusion

The study of fault diagnosis in rotating machinery is essential to reduce machinery maintenance and increase the service life of machinery. To this end, the study proposes a time-frequency analysis method for vibration signals based on the ERB-FSWT transform and an improved capsule network model based on dynamic weighting. The mechanical fault simulation experiments showed that the dynamically weighted algorithm based on the channel attention mechanism achieved the highest fault identification rate, with 99.65%, 99.25%, and 99.90% for sensor 1, sensor 2, and feature fusion data, respectively. For the speed variation experiments, the algorithm proposed in the study obtained higher fault identification accuracy than the CNN algorithm. For example, at 1400 rpm, the proposed algorithm improved the accuracy by 4.59 percentage points over the CNN algorithm. This result highlights the effectiveness of the model for fault identification in rotating machinery vibration signals.

### 6. Funding

The research is supported by: the “QinLan Project” funded by Colleges and universities of Jiangsu Province (Jiang Su Teacher’s Letter [2020] No. 10); the education reform project of “integration of industry and education, school enterprise cooperation” in Suzhou in 2021: exploration and research on the school enterprise collaborative education mode under the background of integration of industry and education - take the “VEICHI Class” of Suzhou Polytechnic Institute of Agriculture as an example (project number: 2021JG104); Suzhou Polytechnic Institute of Agriculture key course team “Maintenance Electrician Technology” (Suzhou Polytechnic Institute of Agriculture Teacher’s Letter [2022] No. 5).

### REFERENCES

## Author

Yaping Zhao obtained a bachelor's degree in mechanical engineering and automation from Soochow University in 2004 and a master's degree in mechanical engineering from Soochow University in 2012. Currently, she serves as an associate professor, director of the teaching and research department, young backbone teacher of Jiangsu Qinglan Engineering, member of Suzhou Society of Mechanical Engineering, senior electrician, examiner, and referee of professional skills competition at the Smart Agriculture School, Suzhou Polytechnic Institute of Agriculture. She is mainly engaged in teaching and research in mechanical design, intelligent manufacturing, and automation control. She has published more than 20 professional papers, been granted 20 patents, compiled and published 4 professional textbooks, and presided over more than 10 scientific research projects at all levels.