Fault diagnosis method of traction motor bearings based on optimized weight local kurtosis

Abstract

Non-intrusive bearing electrical signal fault diagnosis can effectively ensure the safe and reliable operation of motors. To address the issues of weak bearing electrical signal fault characteristics, high random interference, poor interpretability and weak adaptive capabilities of existing methods, a method combining linear prediction filtering-based local spectral kurtosis feature extraction with optimized weight local spectral kurtosis for bearing electrical signal fault diagnosis is proposed. In terms of fault feature extraction, a method based on linear prediction adaptive filtering for local kurtosis feature extraction is proposed. This method extracts bearing fault signals through optimal linear filters, modified average periodogram and local spectral kurtosis analysis. In fault diagnosis evaluation, combining expert rules and machine learning algorithms, an adaptive optimized weight local spectral kurtosis-based health factor assessment model is proposed to enhance the robustness and immunity of fault assessment under complex operating conditions and application scenarios. Finally, to validate the correctness and effectiveness of the proposed method, three different degrees of inner ring defective bearings and one normal bearing are selected as test objects, and common feature extraction methods are compared and verified. Experimental results show that compared to other feature extraction methods, the proposed method combining linear prediction adaptive filtering and local spectral kurtosis feature extraction can effectively extract weak bearing fault information. The proposed health factor diagnostic assessment method based on optimized weight local spectral kurtosis can quantitatively assess the bearing's health status, achieve bearing diagnosis and provide graded early warning. This method boasts clear interpretability, high-performance evaluation and robust adaptability.

1. Introduction

The rolling bearings in electric motors serve as critical components supporting rotation. The health status of these bearings directly affects the safety and reliability of the entire motor operation [1]. Therefore, real-time monitoring of the running condition of motor bearings and timely diagnosis of early fault characteristics is of significant importance for ensuring the safe operation of the entire motor system. Currently, the commonly used methods for analysing motor bearings include vibration, temperature, acoustics, oil film resistance and fibre optic monitoring diagnostic techniques [2]. In the field of railway transportation, online detection methods based on temperature signals [3] and vibration signals [4] are the most widely used in fault diagnosis of traction motor bearings. However, it is necessary to retrofit the vehicle body and install temperature and vibration sensors to obtain signals, which poses difficulties in sensor installation and maintenance, and the sensor itself has a high failure rate, often resulting in false alarms.

To facilitate engineering implementation and reduce implementation cost and maintenance workload, diagnostic techniques based on electrical signals [5] use the sensors already present in the motor control system, such as current and speed sensors, to acquire signals containing bearing fault information. Non-intrusive diagnostics have seen extensive experimentation and research in recent years, leading to the emergence of numerous algorithms. Common methods for diagnosing rolling bearing faults from electrical signals include various time-frequency analysis techniques such as the Fourier transform [6], short-time Fourier transform [7] or wavelet transform [8, 9]. Modal decomposition methods mainly include empirical mode decomposition [10–12] and variational mode decomposition [13–15]. Deconvolution methods comprise the minimum entropy deconvolution algorithm [16, 17], maximum average pulse energy ratio deconvolution and maximum correlation kurtosis deconvolution [18–20]. Additionally, with the advancement of big data technology and intelligent algorithms, scholars worldwide have combined the extracted fault features with machine learning and intelligent optimization algorithms [21, 22], as well as adaptive neuro-fuzzy systems [23]. By setting appropriate parameters for training, intelligent recognition of bearing faults can be achieved. Moreover, deep learning [24] directly uses raw motor data as input and progressively learns data features through multi-layer models, offering valuable insights for intelligent diagnosis of motor bearing faults.

The aforementioned studies represent the primary research methodologies and focal points in the current analysis of electrical signals from the motor stator. Although these methods partially address the motor bearing diagnosis problem, they share common issues such as high complexity, poor resolution, poor generalization ability and interference from cross-terms or local minima. Moreover, because the transmission path is long, the bearing fault components in electrical signals are relatively weak. They are also influenced by various factors such as the fundamental frequency, inherent eccentricity, inverter output harmonics, load fluctuations and the varying sensitivity of fault features to operating conditions. Extracting weak bearing fault signals therefore still faces significant challenges. To highlight fault feature signals, it is crucial to use noise cancellation technology to suppress major components unrelated to bearing faults. The aforementioned methods either extracted the frequency characteristics directly, without prior noise cancellation, or eliminated only the fundamental wave of the power supply, which is the largest interference. In actual operation, motor currents contain not only fundamental frequency components but also a large number of odd harmonics, inherent eccentricity harmonics, slot harmonics and switching frequency harmonics, which significantly interfere with bearing fault feature analysis. Furthermore, in industrial production, most recorded data are from normal operations, posing a serious challenge for bearing fault diagnosis under unbalanced data conditions. Additionally, existing deep learning algorithms lack physical interpretability, raising questions about their applicability in various complex environments.

To quantitatively assess the health status of bearings and achieve fault diagnosis and graded early warning of bearings with strong physical interpretability and excellent evaluation performance, this study optimizes both the feature extraction of bearing fault signals and the fault diagnosis methods. We propose a local spectral kurtosis feature extraction method based on linear prediction adaptive filtering, which filters out predictable harmonic components in the stator current while retaining unpredictable harmonic components caused by bearing faults and speed load fluctuations. By integrating spectral estimation with a modified average periodogram and local spectral kurtosis values, the study successfully extracts fault information characteristics from bearing operational signals. Following the extraction of fault features, combining expert knowledge and the maximum likelihood estimation method, this paper develops an adaptive optimized weight algorithm, aiming to achieve high-precision fault classification and regression analysis. To validate the computational results, the paper employs a load simulation on a traction motor bearing fault simulation test bench for urban rail, using three different degrees of inner ring defective bearings and one normal bearing selected as test objects. It also compares with other common feature extraction methods to verify the analysis. This proves that the method combining linear prediction adaptive filtering and local spectral kurtosis feature extraction can effectively extract weak bearing fault information and has superior feature extraction capabilities. The proposed health factor diagnostic assessment method based on optimized weight local spectral kurtosis can quantitatively evaluate the bearing's health status, achieve bearing diagnosis and provide graded early warning, and has the advantages of excellent accuracy and interpretability.

2. Basic principle

2.1. Method of linear filtering-based local spectral kurtosis feature extraction

The stator current of the motor comprises fundamental current, rotor harmonics, intrinsic eccentricity harmonics, slot harmonics and switching frequency harmonics, which are the principal noise sources in bearing fault diagnosis. These noises exhibit periodic characteristics and can be predicted by their frequency, amplitude or initial phase. However, the bearing fault information possesses impulsive features and is unpredictable. Therefore, to address the excessive noise issue during the analysis of current signals, this study employs a linear prediction model to construct representations of the predictable components in the motor current, including the fundamental current and its multiple harmonics, rotor harmonics and other interferences, transforming them into noise target components for prediction.

For the filtered residual current signal, its frequency components can be analysed using power spectra. Power spectral estimation is widely used in analysing the frequency components of stationary random signals and has been successfully applied in practical engineering, such as fault diagnosis. However, to achieve a greater degree of resolution of specific components of the observed signal, it is necessary to have higher power spectral resolution and lower variance. To this end, this paper proposes a modified averaged periodogram method for power spectral density estimation, which ensures sufficient frequency resolution while allowing overlap between data segments to obtain more segments and reduce variance.

In vibration signal diagnosis, kurtosis is used to measure the intensity of impact pulses on bearing surfaces due to fatigue-induced faults. Each revolution generates an impact pulse at the defect location, and the more severe the fault, the greater the amplitude of the impact response, making the fault more pronounced. However, in stator current signals of motors, fault information is relatively weak compared to vibration signals, and traditional kurtosis metrics represent characteristic information at all peak frequencies in the current signal spectrum, failing to effectively reveal the characteristic fault information of bearings. Therefore, based on normalized spectra, this paper employs local spectral kurtosis of bearing fault characteristic frequency bands as fault features to characterize the prominence of fault feature frequency amplitudes compared to surrounding frequency bands.

In summary, this section introduces the schematic diagram of the local spectral kurtosis feature extraction method based on linear prediction adaptive filtering as shown in Fig. 1. This method primarily encompasses three main steps: adaptive filtering based on linear prediction, spectral estimation based on the modified average periodogram and fault feature extraction based on local spectral kurtosis.

Fig. 1.

Schematic diagram of the local spectral kurtosis feature extraction method based on linear prediction filtering.


2.1.1. Adaptive filtering method based on linear prediction

To address the issue of excessive noise in the analysis of current signals, this section adopts a linear prediction model to represent predictable components of the motor current, such as the fundamental wave and its multiple harmonics, rotor harmonics and other interferences. These components are then transformed into noise target components for prediction. In addition, the current residual information is obtained by combining the original signal with the prediction result. This residual information includes current phase information and other unpredictable components, such as bearing fault information and background Gaussian noise. Therefore, the stator current of a motor with a bearing fault is represented as

$$\begin{eqnarray} I(t) = {I_{{\rm{up}}}}(t) + {I_{{\rm{pr}}}}(t) \end{eqnarray}$$

(1)

where I_up represents the random component of the stator current, that is the unpredictable component, including bearing fault information and background Gaussian noise; I_pr represents the deterministic component of the stator current, which can be predicted, including the fundamental current and its multiple harmonics, rotor harmonics and other interferences.

The prediction model based on linear regression is widely popular because its parameter calculation boils down to simple linear equations. Not only is it easy to use but it is also suitable for characterizing a spectrum with a narrow frequency range. This type of prediction model is based on an autoregressive model, whose output is determined by the weighted sum of the current input and a series of previous outputs. It can be expressed by the following difference equation:

$$\begin{eqnarray} y[n] = - \sum\limits_{k = 1}^p {{a_k}x[{n - k}]} + Gw[n] \end{eqnarray}$$

(2)

where x represents the observed signal sequence; y represents the target value obtained from the p previous values by the difference equation; p represents the prediction order of the model, a positive integer; n represents the sample index within the observed signal sequence; Gw[n] represents the prediction error of the model, with G the system gain of the model; and a_k represents the calculation parameters of the model, with k = 1, 2, …, p.
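A minimal sketch of how Eq. (2) can be used to separate the predictable and unpredictable parts of Eq. (1) is given below. It is an illustration only: the text does not state how the parameters a_k are computed, so an ordinary least-squares fit is assumed here, and the function name `lp_residual`, the order p = 8 and the synthetic test signal are all hypothetical choices.

```python
import numpy as np

def lp_residual(x, p):
    """Fit an order-p linear predictor by least squares; return (a, residual)."""
    x = np.asarray(x, dtype=float)
    # Row for sample n holds the p previous samples x[n-1], ..., x[n-p].
    past = np.column_stack([x[p - k : len(x) - k] for k in range(1, p + 1)])
    target = x[p:]
    # Eq. (2) predicts x[n] as -sum_k a_k x[n-k], so solve past @ (-a) ~= target.
    neg_a, *_ = np.linalg.lstsq(past, target, rcond=None)
    a = -neg_a
    predicted = -past @ a          # estimate of the predictable part I_pr
    residual = target - predicted  # estimate of the unpredictable part I_up
    return a, residual

# Synthetic stand-in for a stator current (assumed values, not measured data):
# two harmonics, sparse impulses and weak Gaussian noise.
fs = 1000.0
t = np.arange(2000) / fs
harmonics = np.sin(2 * np.pi * 50 * t) + 0.3 * np.sin(2 * np.pi * 150 * t)
impulses = np.zeros_like(t)
impulses[100::200] = 0.5
rng = np.random.default_rng(0)
current = harmonics + impulses + 0.01 * rng.standard_normal(t.size)

a, residual = lp_residual(current, p=8)
```

In this toy case the residual keeps the impulsive content while most of the harmonic energy is removed, which is the behaviour the section describes for the filtered stator current.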

2.1.2. Spectral estimation method based on modified average periodogram

For the current components denoised by the linear prediction model, the obtained signal is not ideal but random, owing to various interferences and noise. Such signals are uncertain, aperiodic and of unpredictable amplitude, but they usually conform to certain statistical properties, so their frequency components can be analysed through the power spectrum. Because the traditional periodogram method yields a rough power spectrum with low frequency resolution, this paper proposes a modified average periodogram method for power spectral density estimation. Building on the traditional periodogram, this approach improves the resolution, variance and spectral leakage of the spectral estimate by applying data segment overlap, Hamming windows and a logarithmic spectrum. While ensuring adequate frequency resolution, it allows data segments to overlap so that more segments are obtained, which reduces variance and gives the spectral analysis a degree of continuity between adjacent segments.

Assuming that the length of data in the deterministic time series x(n) is N, this section divides the original sequence into L segments, each of length M with a 50% overlap between adjacent segments. A Hamming window is applied to each segment before its Fourier transform X_l(K) is taken. Consequently, the power spectrum for each segment is expressed as follows:

$$\begin{eqnarray} {G_l}(K) = 2{\left( {\frac{{|{X_l}(K)|}}{M}} \right)^2},\ K = 1,\ \cdots ,\ \frac{M}{2} - 1 \end{eqnarray}$$

(3)

It is necessary to further normalize by the frequency resolution df to obtain the power spectral density, which is expressed by:

$$\begin{eqnarray} {P_l}(K) = \frac{{{G_l}(K)}}{{df}},\ df{\rm{ = }}\frac{{{f_{\rm{s}}}}}{M} \end{eqnarray}$$

(4)

where f_s represents the sampling frequency. The final power spectrum is obtained by averaging the power spectral density of each segment, which is expressed as:

$$\begin{eqnarray} P(K) = \frac{{\sum\limits_{l = 1}^L {{P_l}(K)} }}{L} \end{eqnarray}$$

(5)

Because the fault characteristic signal contained in the current signal is weak, a variable weight logarithmic spectrum PS is adopted to increase the weight coefficient of frequency components with a small contribution and decrease the weight coefficient of frequency components with a large contribution, so that the secondary components are highlighted. The logarithmic spectral unit is usually decibels (dB), PS(K) = 20lg(P(K)).
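The procedure of Eqs. (3)–(5) and the logarithmic spectrum can be sketched as follows. This is a hedged illustration rather than the implementation used in the experiments: the function name `modified_periodogram_db`, the segment length M = 200 and the test signal are assumptions, and no window-gain correction is applied because none appears in the equations.

```python
import numpy as np

def modified_periodogram_db(x, fs, M):
    """Eqs. (3)-(5) and PS = 20 lg(P), using 50% overlapping Hamming segments."""
    x = np.asarray(x, dtype=float)
    step = M // 2                     # 50% overlap between adjacent segments
    L = (len(x) - M) // step + 1      # number of complete segments
    window = np.hamming(M)
    df = fs / M                       # frequency resolution, Eq. (4)
    K = np.arange(1, M // 2)          # K = 1, ..., M/2 - 1, as in Eq. (3)
    psd_sum = np.zeros(len(K))
    for l in range(L):
        seg = x[l * step : l * step + M] * window
        X = np.fft.fft(seg)
        G = 2.0 * (np.abs(X[K]) / M) ** 2   # Eq. (3)
        psd_sum += G / df                   # Eq. (4)
    P = psd_sum / L                         # Eq. (5)
    PS = 20.0 * np.log10(P)                 # logarithmic spectrum from the text
    return K * df, P, PS

# Assumed test signal: a 50 Hz tone with weak noise, sampled at 1 kHz.
fs = 1000.0
t = np.arange(2000) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 50 * t) + 0.01 * rng.standard_normal(t.size)
freqs, P, PS = modified_periodogram_db(x, fs, M=200)
```

With M = 200 and f_s = 1000 Hz the resolution is df = 5 Hz, so a longer segment gives finer resolution but fewer segments to average, which is the trade-off that the overlap is meant to soften.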

2.1.3. Fault feature extraction method based on local spectral kurtosis

Traditionally, kurtosis is computed over the whole data sample to evaluate the magnitude of the impact pulses caused by bearing failure: the larger the impact response amplitude, the more obvious the fault. For the current signal, however, even after filtering and denoising the signal-to-noise ratio remains low and the noise interference is complex, so the filtered residual current contains large non-Gaussian noise components in addition to the bearing fault characteristics, and a single global kurtosis index cannot represent the fault information. Instead, the amplitude of the fault frequency component is analysed within the local frequency band around each fault frequency point. Therefore, in this section, the local spectral kurtosis of fault feature bands after spectrum normalization is used to characterize fault information, analysing faults based on the prominence of fault characteristics within local frequency bands.

After signal filtering and spectrum estimation, in order to unify the measurement scale of power spectrum amplitude, it is necessary to normalize the power spectrum amplitude, that is, the ratio between each amplitude point and the sum of amplitude:

$$\begin{eqnarray} NPS = \frac{{PS}}{{\sum {PS} }} \end{eqnarray}$$

(6)

To calculate the local spectral kurtosis, it is necessary to select a local frequency band with a bandwidth b_w centred around the fault characteristic frequency f_c. The fault characteristic frequency f_c must be converted to the discrete point d_c, which is closest to its numerical value. For the bandwidth of the given local frequency band converted to the discrete point, the local band bandwidth d_w can be expressed as:

$$\begin{eqnarray} {d_{\rm{w}}} = 2R\left( {\frac{1}{2}{b_{\rm{w}}}\frac{{{D_{\rm{s}}}}}{{{f_{\rm{s}}}}}} \right) \end{eqnarray}$$

(7)

In the formula, R(·) represents the rounding operation, D_s represents the total number of points of the original data corresponding to each segment of the power spectrum and f_s is the sampling frequency. Translated into the discrete domain, this is a local frequency band with a length of d_w, centred on the fault frequency point d_c. The Normalized Power Spectral (NPS) values within this band are denoted as $\{ {NPS( i )} \},\ i = 1,\ 2,\ \cdots ,\ {d_{\rm{w}}}$.
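As a worked illustration of Eq. (7), take purely hypothetical values that are not drawn from the experiments: b_w = 10 Hz, D_s = 10000 points and f_s = 5000 Hz. Then

$$\begin{eqnarray} {d_{\rm{w}}} = 2R\left( {\frac{1}{2} \times 10 \times \frac{{10000}}{{5000}}} \right) = 2R(10) = 20 \end{eqnarray}$$

so the local band spans 20 discrete points around d_c. Rounding half of the bandwidth and then doubling it keeps d_w even, so the band extends equally on both sides of the centre point.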

Because speed fluctuations and inaccurate calculation of the theoretical fault characteristic frequency introduce errors in practice, the fault frequency point cannot be located exactly within this local frequency band. Therefore, the peak energy centred on the fault characteristic frequency is adopted as the calculation index. Assuming that the selected centre bandwidth is η, the peak energy ν_s of this local frequency band is:

$$\begin{eqnarray} {\nu _{\rm{s}}} = \sqrt {\frac{1}{{2\eta + 1}}\sum\limits_{i = {d_{\rm{w}}}/2 + 1 - \eta }^{{d_{\rm{w}}}/2 + 1 + \eta } {NPS{{(i)}^2}} } - \mu {\{ NP{S_i}\} _{i = {d_{\rm{w}}}/2 + 1 - \eta :{d_{\rm{w}}}/2 + 1 + \eta }} \end{eqnarray}$$

(8)

where μ{·} represents the mean value calculation function. The standard deviation σ_s of the local frequency band is calculated from the amplitudes of the power spectrum outside the centre bandwidth η as:

$$\begin{eqnarray} {\sigma _{\rm{s}}} = E{\{ {(NPS(i) - \mu (NPS(i)))^2}\} ^{\frac{1}{2}}},\ i = 1,\ \cdots ,\ {d_{\rm{w}}}/2 - \eta ,\ {d_{\rm{w}}}/2 + \eta ,\ \cdots ,\ {d_{\rm{w}}} \end{eqnarray}$$

(9)

where E{·} denotes the expectation operator. Therefore, the characteristic frequency component of a motor bearing fault can be characterized by the local spectral kurtosis of the normalized spectrum:

$$\begin{eqnarray} LS{K_{\rm{s}}} = \left\{ \begin{array}{@{}l@{}} \nu_{\rm{s}}^4/\sigma _{\rm{s}}^4,\ {\nu_{\rm{s}}} > 0\\ 0, \ {\nu_{\rm{s}}} \le 0 \end{array} \right. \end{eqnarray}$$

(10)

According to Eq. (10), the local spectral kurtosis of all motor bearing fault characteristic frequency points can be calculated, and each local spectral kurtosis value can represent the fault information of this characteristic frequency point.
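The calculation of Eqs. (8)–(10) for a single local band can be sketched as below. Several points are assumptions made for illustration: the function name `local_spectral_kurtosis` is hypothetical, the band is passed in with its centre point in the middle, Eq. (8) is read literally with the mean taken over the same centre band as the root mean square, and a zero-spread guard is added to avoid division by zero. If the mean in Eq. (8) is intended over a different range, only the line computing `nu` changes.

```python
import numpy as np

def local_spectral_kurtosis(band, eta):
    """Eqs. (8)-(10) for one local band of the normalized spectrum.

    band: NPS values of the local band, centre point in the middle.
    eta:  half-width of the centre band, in points.
    """
    band = np.asarray(band, dtype=float)
    c = len(band) // 2
    centre = band[c - eta : c + eta + 1]
    side = np.concatenate([band[: c - eta], band[c + eta + 1 :]])
    # Eq. (8), read literally: RMS of the centre band minus its mean.
    nu = np.sqrt(np.mean(centre ** 2)) - np.mean(centre)
    # Eq. (9): spread of the band outside the centre.
    sigma = np.std(side)
    if nu <= 0 or sigma == 0:
        return 0.0  # Eq. (10), plus an added guard against division by zero
    return (nu / sigma) ** 4

# Assumed toy bands: a gently varying background, with and without a peak
# at the centre point.
background = 1.0 + 0.01 * np.cos(np.arange(21))
with_peak = background.copy()
with_peak[10] += 0.5

lsk_healthy = local_spectral_kurtosis(background, eta=2)
lsk_fault = local_spectral_kurtosis(with_peak, eta=2)
```

The fourth power makes the feature grow quickly once the centre band stands out from its surroundings, so a band with a pronounced peak yields a much larger value than a band without one.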

2.2. Bearing fault state evaluation method based on optimized weight local spectral kurtosis

Bearing faults are diagnosed and evaluated based on the local spectral kurtosis fault features extracted in the previous section. In rail transit, complex route conditions, frequent changes in working conditions and unstable operation lead to excessive current harmonics and other complex disturbances, so the representation of bearing faults in the current power spectrum is often random. This manifests as an irregular position and amplitude of the peak at the fault characteristic frequency in the power spectrum of the faulty bearing, so one or a few fault characteristic values cannot simply be used as the evaluation standard of bearing fault severity.

Starting from the local spectral kurtosis feature data set of the characteristic frequency range of the normalized spectral estimate, this paper proposes a new statistical-index weight optimization method to adaptively extract fault features, and constructs a new health factor based on the adaptive optimal weight local spectral kurtosis to evaluate the bearing health state. The health factor allows quantitative evaluation of the bearing state and has clear physical significance. The bearing fault characteristics can be further enhanced: the method exploits the strong physical interpretability of expert rules together with the strong adaptive ability of machine learning, and addresses the lack of an automatic optimization mechanism, the poor generalization ability and the poor interpretability of traditional methods. The schematic diagram of the proposed state evaluation method based on optimized weight local spectral kurtosis is shown in Fig. 2, which mainly includes the solving process of the optimized weights and the construction process of the health factor.

Fig. 2.

Schematic diagram of the state evaluation method based on optimized weight local spectral kurtosis.


2.2.1. Adaptive optimization weight method

Drawing on the maximum-margin linear classification idea of Support Vector Machines, this paper proposes an adaptive optimization weight algorithm. By establishing a classification hyperplane as the decision surface, the separation margin between fault and normal samples is maximized; the objective function is solved by maximum log-likelihood estimation, and the optimal weight w* is then obtained iteratively with the gradient descent method. Physically, w* reflects the feature difference between fault data and normal data: a larger positive weight indicates a greater difference, while a negative weight indicates an interference feature. The solution comprises four stages: signal collection and preprocessing, construction of the local spectral kurtosis feature sets of healthy and faulty samples, construction of the weighted feature set and optimization of the weight solution.

1) Signal collection and preprocessing
Collect m equal-length segments of healthy signals and n equal-length segments of faulty signals. Using the local spectral kurtosis feature extraction method based on linear prediction filtering, convert the data into power spectral amplitude datasets {PS_H}^m for the m healthy segments and {PS_F}^n for the n faulty segments. Then normalize the power spectral amplitude values to obtain {NPS_H}^m and {NPS_F}^n.
2) Construct the local spectral kurtosis feature sets of healthy and faulty samples
Based on the normalized power spectrum data sets of healthy samples {NPS_H}^m and faulty samples {NPS_F}^n, calculate the local spectral kurtosis feature vector in the power spectrum of each sample. Select the fault features in the low-frequency band, choosing local spectral kurtosis feature values with strong correlation. The local spectral kurtosis feature sets of healthy samples {LSK_H1…H_a}^m and faulty samples {LSK_F1…F_a}^n are calculated by Eq. (10).
3) Weighted feature set
Establish an optimal separation model based on local spectral kurtosis, using the local spectral kurtosis datasets under the healthy state LSK_H = {LSK_H1, …, LSK_H_m} and the faulty state LSK_F = {LSK_F1, …, LSK_F_n}, with each LSK ∈ R^{a×1}, where subscripts H and F represent the healthy and fault states respectively, and m and n represent the number of samples in the corresponding data sets. In a-dimensional space, LSK_H and LSK_F together form m+n data points. Since the main difference between the two states is that the local spectral kurtosis is larger in the fault state, the local spectral kurtosis features of the healthy state and the fault state can be separated by the following linear separation model:
$$\begin{eqnarray} \textit{SWLSK} = {\boldsymbol w^{\rm{T}}}LSK + b = 0 \end{eqnarray}$$
(11)

where $\boldsymbol w = [ {\boldsymbol w[1],\ \cdots ,\ \boldsymbol w[a]} ] \in {R^{a \times 1}}$ and b is a scalar.

The optimal separation model can be understood as finding the optimal hyperplane so that the separation distance between the two types of points is maximum. The distance formula in a-dimensional space can be expressed as:

$$\begin{eqnarray} d = \frac{{| {{\boldsymbol w^{\rm{T}}}LSK + b}|}}{{\left\| \boldsymbol w \right\|}},\ \left\| \boldsymbol w \right\| = \sqrt {\sum\nolimits_{n = 1}^a {{{\left| {\boldsymbol w[n]} \right|}^2}} } \end{eqnarray}$$

(12)

where w^TLSK is the vector dot product of w and LSK.

4) Optimization of weight solutions
To solve the linear classification model and obtain the optimal linear classification parameters w*, this paper solves based on the maximum likelihood estimation method:

Assume that the datasets LSK_H and LSK_F are jointly labelled as $T = \{ {( {LS{K^j},{y_j}} )| {LS{K^j} \in {R^a},{y_j} \in \{ {0,1} \}} } \}_{j = 1}^{m + n}$, where y_j is the class label. When y_j = 1, $LS{K^j} \in LS{K_{\rm F}}$; when y_j = 0, $LS{K^j} \in LS{K_{\rm H}}$. Under the maximum likelihood formulation, the conditional probability of the state label y can be expressed as follows:

$$\begin{eqnarray} P({y = 1|LSK}) &=& \frac{{\exp ({\boldsymbol w^{\rm{T}}}LSK + b)}}{{1 + \exp ({\boldsymbol w^{\rm{T}}}LSK + b)}} = \pi (LSK)\\P({y = 0|LSK}) &=& \frac{1}{{1 + \exp ({\boldsymbol w^{\rm{T}}}LSK + b)}} = 1 - \pi (LSK) \end{eqnarray}$$

(13)

where π(LSK) represents the conditional probability of y = 1. When $P( {y = 1| {LS{K^j}} } ) > P( {y = 0| {LS{K^j}} } )$, the status label of LSK^j is y_j = 1; conversely, when $P( {y = 1| {LS{K^j}} } ) \le P( {y = 0| {LS{K^j}} } )$, the status label of LSK^j is y_j = 0. If the optimal parameters w and b are found, then $P( {y = 1| {LS{K_{{\rm F}n}}} } ) > 0.5 > P( {y = 1| {LS{K_{{\rm H}m}}} } )$, meaning LSK_F_n can be correctly classified as a fault state; similarly, $P( {y = 0| {LS{K_{{\rm H}m}}} } ) > 0.5 > P( {y = 0| {LS{K_{{\rm F}n}}} } )$, so LSK_H_m can be correctly classified as a healthy state. To estimate parameter w, the maximum likelihood estimation model is established as follows:

$$\begin{eqnarray} \mathop {{\rm{arg}}}\limits_{\boldsymbol w} {\rm{max}}\prod\limits_{j = {\rm{1}}}^{m + n} {{{[\pi (LS{K^j})]}^{{y_j}}}{{[1 - \pi (LS{K^j})]}^{1 - {y_j}}}} \end{eqnarray}$$

(14)

By logarithmic operation, the objective function can be converted into the following form:

$$\begin{eqnarray} \mathop {{\rm{arg}}}\limits_{\boldsymbol \beta} {\rm{min}}\,{\rm{L}}(\boldsymbol \beta ) = \frac{1}{{2(m + n)}}\sum\limits_{j = 1}^{m + n} {[ - {y_j}{\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j} + \log (1 + \exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j}))]} \end{eqnarray}$$

(15)

where ${\boldsymbol x_j} = {[ {{{( {LS{K^j}} )}^{\rm{T}}}\ 1} ]^{\rm{T}}}$ and $\boldsymbol \beta = {[ {{\boldsymbol w^{\rm{T}}}\ b} ]^{\rm{T}}}$, so that ${\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j} = {\boldsymbol w^{\rm{T}}}LS{K^j} + b$. If the data set cannot be strictly linearly separated, an L2 norm regularization term can be introduced into the objective function, and the new objective function is obtained as follows:

$$\begin{eqnarray} \mathop {{\rm{arg}}}\limits_{\boldsymbol \beta} {\rm{min}}\,{\rm{L}}(\boldsymbol \beta ) &=& \frac{1}{{2(m + n)}}\sum\nolimits_{j = 1}^{m + n} {[ - {y_j}{\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j} + \log (1 + \exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j}))]}\\&&+\, \frac{1}{{2({m + n})}}\lambda {\boldsymbol \beta ^{\rm{T}}}\boldsymbol \beta \end{eqnarray}$$

(16)

where λ is a non-negative regularization term coefficient.

The gradient of the objective function with respect to β can be derived as follows:

$$\begin{eqnarray} \frac{{\partial {\rm{L}}(\boldsymbol \beta )}}{{\partial \boldsymbol \beta }} &=& \frac{1}{{2(m + n)}}\sum\limits_{j = 1}^{m + n} {\left[ {\left( {\frac{{\exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j})}}{{1 + \exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j})}} - {y_j}} \right){\boldsymbol x_j}} \right]} + \frac{1}{{(m + n)}}\lambda \boldsymbol \beta \\&=& \frac{1}{{2(m + n)}}\boldsymbol X(\boldsymbol H - \boldsymbol Y) + \frac{1}{{(m + n)}}\lambda \boldsymbol \beta \end{eqnarray}$$

(17)

The matrices X, H and Y are defined as follows:

$$\begin{eqnarray} \boldsymbol X &=& [{\boldsymbol x_1} \cdots {\boldsymbol x_j} \cdots {\boldsymbol x_{m + n}}] \in {R^{(a + 1) \times (m + n)}},\\\boldsymbol Y &=& {[{y_1} \cdots {y_j} \cdots {y_{m + n}}]^{\rm{T}}} \in {R^{m + n}},\\\boldsymbol H &=& {\left[ {\frac{{\exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_1})}}{{1 + \exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_1})}} \cdots \frac{{\exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j})}}{{1 + \exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_j})}} \cdots \frac{{\exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_{m + n}})}}{{1 + \exp ({\boldsymbol \beta ^{\rm{T}}}{\boldsymbol x_{m + n}})}}} \right]^{\rm{T}}} \in {R^{m + n}} \end{eqnarray}$$

(18)

Therefore, the minimization problem after adding regularization terms can be solved by the gradient descent method.
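The gradient descent solution of the regularized objective in Eq. (16), using the gradient of Eq. (17) in matrix form, can be sketched as follows. The function name `fit_weights`, the step size, the iteration count, the value of λ and the synthetic feature sets are all assumptions chosen for illustration; the text does not specify them.

```python
import numpy as np

def fit_weights(lsk, y, lam=0.1, lr=0.5, n_iter=5000):
    """Minimize Eq. (16) by gradient descent with the gradient of Eq. (17).

    lsk: array (m+n, a) of local spectral kurtosis features, one row per sample.
    y:   array (m+n,) of labels, 1 = fault, 0 = healthy.
    """
    lsk = np.asarray(lsk, dtype=float)
    y = np.asarray(y, dtype=float)
    N = len(y)
    # Columns of X are x_j = [LSK^j; 1], so X has shape (a+1, m+n).
    X = np.hstack([lsk, np.ones((N, 1))]).T
    beta = np.zeros(X.shape[0])                        # beta = [w; b]
    for _ in range(n_iter):
        H = 1.0 / (1.0 + np.exp(-(beta @ X)))          # pi(LSK^j), Eq. (13)
        grad = X @ (H - y) / (2 * N) + lam * beta / N  # Eq. (17)
        beta -= lr * grad
    return beta[:-1], beta[-1]

# Assumed synthetic features: three LSK features per sample, of which only
# the first is larger in the fault state.
rng = np.random.default_rng(0)
healthy = rng.normal(1.0, 0.2, size=(40, 3))
faulty = rng.normal(1.0, 0.2, size=(40, 3)) + np.array([2.0, 0.0, 0.0])
features = np.vstack([healthy, faulty])
labels = np.concatenate([np.zeros(40), np.ones(40)])

w_opt, b_opt = fit_weights(features, labels)
predicted = (features @ w_opt + b_opt > 0).astype(float)
accuracy = np.mean(predicted == labels)
```

In this toy setting the weight on the fault-sensitive feature comes out positive, which matches the physical reading of w* given at the start of this subsection.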

2.2.2. Health factor construction method based on optimal weight

According to the optimization weights w* solved in the previous section, in order to achieve optimal separation, different components of the vector w* should meet the following conditions: 1) In order to maximize the distance, components corresponding to fault-sensitive features in LSK should be as large as possible, which can make the distance d as large as possible; 2) In order to maximize the distance, the components corresponding to non-fault-sensitive features in LSK should be as small as possible, which can make the distance d as large as possible; 3) Combined with 1) and 2), it can be seen that the final model obtained with sufficient data actually reflects the fault-sensitive local spectral kurtosis feature, while other interference feature components can be eliminated, so the relevant information of the fault-sensitive local spectral kurtosis feature can be obtained while the health and fault states are optimally classified. Therefore, the influence of random background noise and random interference signals can be eliminated by optimizing weights w*, and the local spectral kurtosis eigenvalues can be screened, and a new bearing state evaluation health factor based on optimized weights can be constructed.

First, the local spectral kurtosis features with positive weights are retained: these are the fault-sensitive components, whose values under fault are larger than under normal conditions. The features with negative weights are discarded as non-fault-sensitive components. The retained positive weights are then converted to percentages, giving the relative fault information values:

$$\begin{eqnarray} RFP({{s_k}}) = \frac{{100 \times {\boldsymbol w^*}({{s_k}})}}{{\sum\nolimits_{j = 1}^s {{\boldsymbol w^*}({{s_j}})} }}\% \end{eqnarray}$$

(19)

In the equation, s is the number of retained (positive) weights and RFP(s_k) is the relative fault information of the weight w*(s_k).
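A minimal sketch of Eq. (19) follows. The weight values and the dictionary layout (feature index mapped to weight) are hypothetical and only illustrate the calculation.

```python
def relative_fault_information(weights):
    """Return {feature_id: RFP in percent}, computed over the positive weights only."""
    positive = {k: w for k, w in weights.items() if w > 0}
    total = sum(positive.values())
    return {k: 100.0 * w / total for k, w in positive.items()}

# Hypothetical optimized weights keyed by feature index (not values from the experiments).
w_star = {1: 0.8, 2: 0.6, 3: -0.2, 4: 0.4, 5: 0.2, 6: -0.1}
rfp = relative_fault_information(w_star)
for feature_id, value in sorted(rfp.items()):
    print(feature_id, round(value, 2))
```

The negative-weight features 3 and 6 drop out, and the remaining percentages sum to 100.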

Then, based on the relative fault information, select sub-signals and reconstruct fault signal components. Choose the local spectral kurtosis features corresponding to the top Y% of relative fault information for reconstruction. The reason for selecting the top Y% of features is that some features may contain more noise and fewer fault components, so only the fault features within the top Y% are selected. Y is determined by the user according to their needs, but since selecting the top Y% is relatively rigid, there generally won't be exactly Y% of fault features in the top Y%, so the following process is designed to achieve adaptive selection: arrange the remaining relative fault information quantification values of weights in descending order to obtain a vector in descending order |$[ {RF{P_1},\ RF{P_2},\ \cdots ,\ RF{P_j},\ \cdots ,\ RF{P_s}} ]$|⁠.

Minimize the objective function |$\mathop {\arg }\limits_{{N_1}} \min | {Y\% - \sum\nolimits_{j = 1}^{{N_1}} {RF{P_j}} } |$|⁠. After minimization, N₁ represents the number of remaining weights selected, and the corresponding local spectral kurtosis is selected and retained, while the others are discarded.
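The adaptive selection of N₁ can be sketched as a scan over the cumulative sum of the sorted RFP values. The RFP values and the target Y = 85 below are hypothetical, and breaking ties in favour of the smaller N₁ is an assumption.

```python
def select_top_features(rfp, target_percent):
    """Return the ids of the N1 largest-RFP features, where N1 minimizes
    |target_percent - sum of the N1 largest RFP values|."""
    ranked = sorted(rfp.items(), key=lambda kv: kv[1], reverse=True)
    best_n, best_gap, running = 0, float("inf"), 0.0
    for n, (_, value) in enumerate(ranked, start=1):
        running += value
        gap = abs(target_percent - running)
        if gap < best_gap:  # strict comparison keeps the smaller N1 on ties
            best_n, best_gap = n, gap
    return [feature_id for feature_id, _ in ranked[:best_n]]

# Hypothetical RFP values in percent, keyed by feature index.
rfp = {1: 40.0, 2: 30.0, 4: 20.0, 5: 10.0}
kept = select_top_features(rfp, target_percent=85.0)
print(kept)
```

Here the cumulative sums are 40, 70, 90 and 100, so the sum closest to 85 is reached with three features.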

Based on the retained optimized weights and the screened local spectral kurtosis features, an inner product is taken to construct a new health indicator for bearing condition assessment based on optimized weights.

$$\begin{eqnarray} \textit{OWSI}({LS{K_{\rm{s}}}}) = {\left({\boldsymbol w_{\rm{s}}^*} \right)^{\rm{T}}}LS{K_{\rm{s}}} \end{eqnarray}$$

(20)

In the equation, OWSI represents the health factor based on optimized weights, |$\boldsymbol w_{\rm{s}}^*$| represents the optimized weights retained after screening and LSK_s represents the corresponding local spectral kurtosis values retained after screening.
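Eq. (20) is a plain inner product, sketched below with hypothetical weights and LSK values. The optional renormalization of the retained weights to sum to one follows the remark in the experimental section that the selected weights are redistributed in proportion; treating it as a switch is an assumption of this sketch.

```python
def owsi(weights, lsk, kept_ids, renormalize=True):
    """Health factor: inner product of the retained weights and the retained LSK values."""
    w = [weights[k] for k in kept_ids]
    if renormalize:
        total = sum(w)
        w = [value / total for value in w]
    return sum(wi * lsk[k] for wi, k in zip(w, kept_ids))

# Hypothetical retained weights and LSK values, keyed by feature index.
w_star = {1: 0.8, 2: 0.6, 4: 0.4}
lsk_normal = {1: 3.0, 2: 3.1, 4: 2.9}
lsk_faulty = {1: 4.9, 2: 4.5, 4: 3.5}
kept_ids = [1, 2, 4]

print("normal:", owsi(w_star, lsk_normal, kept_ids))
print("faulty:", owsi(w_star, lsk_faulty, kept_ids))
```

With renormalized weights the health factor is a weighted average of the retained LSK values, so it stays on the same scale as the local spectral kurtosis itself.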

The newly constructed health factor OWSI has a clear physical meaning of quantitative assessment of the bearing state. Features with greater discrimination between faults and normal states are assigned higher weights, enabling the quantification of bearing fault severity. Moreover, it automatically filters out noise and interference features, further enhancing the discriminative power of bearing fault characteristics. It exhibits strengths in adaptability, good generalization capability and strong interpretability.

3. Bearing fault electrical signal diagnosis process

The fault diagnosis process of traction motor bearing based on optimized weight local kurtosis is shown in Fig. 3.

Fig. 3.

Flowchart of bearing fault electrical signal diagnosis.


The specific steps are:

Step 1: Collect several groups of current signals under different working conditions; the length of data collected under the same working conditions is not less than 300×fs points.
Step 2: Filter the original current acquisition signal based on the linear regression prediction model.
Step 3: Perform logarithmic power spectrum analysis based on the modified average periodogram method for residual current signal.
Step 4: The power spectrum is normalized to calculate the local spectral kurtosis of each fault characteristic frequency band.
Step 5: Calculate the health factor based on the local spectral kurtosis of the optimized weights.
1. Collect bearing fault and normal data samples offline, using linear predictive filtering and spectral estimation to obtain two sets of normalized power spectrum data sets;
2. Extract the local spectral kurtosis of normalized power spectrum to obtain the local spectral kurtosis feature set of fault and normal samples;
3. Establish the optimal separation model based on local spectral kurtosis, and solve the optimal weight vector based on maximum likelihood estimation and the gradient descent method;
4. Construct a health indicator for bearing condition assessment based on optimized weights by automatically filtering local spectral kurtosis features.
Step 6: Quantitative evaluation of bearing status based on health factors to characterize the motor bearing fault status and its severity.
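Step 4 can be illustrated with a small sketch. The exact definition of local spectral kurtosis is given with the feature extraction method and is not restated here; the sketch assumes the ordinary moment-based kurtosis (fourth central moment divided by the squared variance) of the normalized power spectrum values inside a window around a characteristic frequency. The window half-width and the toy spectrum are hypothetical.

```python
def local_spectral_kurtosis(spectrum, freqs, center, half_width):
    """Kurtosis m4 / m2**2 of the spectrum values with |f - center| <= half_width."""
    band = [p for f, p in zip(freqs, spectrum) if abs(f - center) <= half_width]
    n = len(band)
    mean = sum(band) / n
    m2 = sum((p - mean) ** 2 for p in band) / n
    m4 = sum((p - mean) ** 4 for p in band) / n
    if m2 == 0:
        return 0.0  # perfectly flat band: no peak to measure
    return m4 / (m2 ** 2)

# Toy normalized spectrum on a 1 Hz grid from 90 Hz to 108 Hz.
freqs = [90 + i for i in range(19)]
flat = [1.0 + 0.1 * (i % 2) for i in range(19)]   # small ripple, no fault peak
peaked = list(flat)
peaked[9] = 3.0                                   # isolated peak at 99 Hz

lsk_flat = local_spectral_kurtosis(flat, freqs, center=99, half_width=9)
lsk_peak = local_spectral_kurtosis(peaked, freqs, center=99, half_width=9)
print(lsk_flat, lsk_peak)
```

An isolated spectral peak inside the window raises the kurtosis sharply, which is why the indicator responds to narrow fault components rather than to the overall noise level.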

4. Experimental platform and experimental analysis

Fig. 4 shows the simulated traction motor bearing fault test rig used in this study, which consists of the test system, the complementary test system and the signal acquisition system. It is used to simulate the load torque of the real vehicle traction motor. To explore how bearing fault severity manifests in the stator current, and to realize bearing fault diagnosis and graded early warning, bearings from the Swedish company SKF were selected, and inner ring defects of three different severities were prefabricated. The traction motor in the test system was set in turn to normal operation and to each bearing fault condition, and full-load combination tests were conducted using bearings of the same type but with different fault severities. The experiment used bearings at the drive end of the traction motor in four states: normal, minor inner ring fault, moderate inner ring fault and severe inner ring fault, as shown in Fig. 5.

Fig. 4.

Simulation of traction motor faults on drag test bench.


Fig. 5.

Artificially manufactured inner race fault bearings of various severity levels.


To validate the proposed local spectral kurtosis feature extraction method based on linear predictive filtering, prefabricated inner ring faulty bearings were compared with normal motor bearings. Fig. 6 compares the current signals of the prefabricated faulty bearing before and after filtering at constant speeds of 1,800 RPM and 2,500 RPM under no-load conditions. The blue line is the original current waveform before filtering, and the red line is the waveform after linear prediction filtering. Under both operating conditions, the filtered waveform differs markedly from the unfiltered one: the sinusoidal periodic component of the waveform has been eliminated, and the amplitude of the waveform decreases significantly.

Fig. 6.

Comparison chart of results before and after noise reduction under 1,800 RPM and 2,500 RPM operating conditions: (a) 1,800 RPM, before noise reduction; (b) 2,500 RPM, before noise reduction; (c) 1,800 RPM, after noise reduction; (d) 2,500 RPM, after noise reduction.


On the other hand, comparative analysis is conducted based on the spectral analysis results. Fig. 7 presents the comparison results of stator current signals from normal and faulty bearings using the modified average periodogram method based on linear predictive filtering. The spectral data up to 500 Hz are taken from the conditions of constant speeds of 1,800 RPM and 2,500 RPM, respectively, with no load torque, and compared between the mentioned faulty bearing data and normal bearing data under the same operating conditions. According to the comparison results, no matter how the working condition changes, the normal bearing spectrum is very clean, with smooth variations, minimal amplitude fluctuations and no significant peaks, indicating that the bearing has no fault; conversely, the spectrum plotted from the faulty bearing data exhibits noticeable peaks at certain frequencies corresponding to fault features, while the amplitude fluctuations at the same frequency positions for the corresponding normal bearing data are minimal. Therefore, the method described in this section accurately captures bearing fault features in the current, enabling the differentiation between normal and faulty bearings.

Fig. 7.

Power spectrum comparison chart between normal bearing and faulty bearing under different operating conditions: (a) 1,800 RPM, normal; (b) 2,500 RPM, normal; (c) 1,800 RPM, faulty; (d) 2,500 RPM, faulty.


In addition, the method in this section is compared with the logFFT transform, the power spectrum transform and comb-based filtering in Figs. 8(a)−(h), and the feature extraction results are analysed in Tables 1 and 2, where f_i, f_r and f_e denote the bearing inner ring fault characteristic frequency, the rotating frequency and the current fundamental frequency, respectively. The spectrum after the logarithmic Fourier transform shows only slight bumps at the bearing fault characteristic frequencies of 99 Hz, 190.2 Hz, 219.2 Hz, 306.6 Hz, 407.4 Hz and 459.2 Hz under 1,800 RPM, and 136.2 Hz, 176.8 Hz, 261.0 Hz, 302.0 Hz, 344.1 Hz, 388.8 Hz and 469.2 Hz under 2,500 RPM, and these bumps are much smaller than the amplitudes of the current fundamental frequency and its harmonics.

Fig. 8.

Comparison results of different feature extraction methods under 1,800 RPM and 2,500 RPM conditions: (a) 1,800 RPM, logFFT; (b) 2,500 RPM, logFFT; (c) 1,800 RPM, power spectrum analysis; (d) 2,500 RPM, power spectrum analysis; (e) 1,800 RPM, comb filter; (f) 2,500 RPM, comb filter; (g) 1,800 RPM, the proposed method; (h) 2,500 RPM, the proposed method.


Table 1.


Results of feature frequency calculation and local spectral kurtosis results with different signal processing methods under 1,800 RPM operating conditions.

Serial number	Characteristic frequency	Characteristic frequency value (Hz)	LSK (logFFT)	LSK (power spectrum)	LSK (comb filtering)	LSK (the proposed method)
1	f_i − 2f_e − 3f_r	99.0	2.93	2.90	4.32	4.92
2	f_i − 2f_e − 2f_r	128.7	1.94	2.04	3.23	1.90
3	f_i − f_e − 2f_r	190.2	3.00	3.46	3.83	4.76
4	f_i − f_e − f_r	219.2	3.80	3.80	3.81	3.99
5	f_i + f_e − 3f_r	278.7	2.03	1.98	2.39	2.53
6	f_i − f_e + 2f_r	306.6	1.99	3.51	3.94	4.28
7	f_i − f_e + 3f_r	338.7	1.83	1.65	2.40	1.97
8	f_i − 2f_e − 2f_r	407.4	1.90	2.18	2.55	4.00
9	f_i + f_e + 3f_r	459.2	2.84	3.00	3.16	3.97

Table 2.


Results of feature frequency calculation and local spectral kurtosis results with different signal processing methods under 2,500 RPM operating conditions.

Serial number	Characteristic frequency	Characteristic frequency value (Hz)	LSK (logFFT)	LSK (power spectrum)	LSK (comb filtering)	LSK (the proposed method)
1	f_i − 2f_e − 3f_r	136.2	1.97	2.48	3.40	4.27
2	f_i − 2f_e − 2f_r	176.8	2.23	2.63	3.54	4.85
3	f_i − 2f_e − f_r	219.4	2.33	3.16	3.37	1.48
4	f_i − f_e − 2f_r	261.0	1.18	1.70	2.12	4.93
5	f_i − f_e − f_r	302.0	1.76	2.65	2.79	5.00
6	f_i − 2f_e + 2f_r	344.1	2.93	3.03	3.79	5.02
7	f_i − f_e + f_r	388.8	3.26	3.57	3.63	4.97
8	f_i + f_e − 2f_r	428.9	1.96	2.11	2.39	2.07
9	f_i + f_e − f_r	469.2	2.89	3.42	3.44	4.98

The power spectrum results also show bumps at the same bearing fault characteristic frequencies, and these bumps are more pronounced than in the ordinary logarithmic Fourier transform results, although their amplitude is still smaller than that of the current fundamental frequency and its harmonics. The comb-filtered spectrum shows bumps at the same positions as well; because the filtering reduces the current fundamental frequency and its harmonics to a certain extent, the bumps at the bearing fault characteristic frequencies are more pronounced than in the logarithmic Fourier transform and power spectrum results. However, the residual harmonics of the fundamental frequency, namely the fifth harmonic at 300 Hz under 1,800 RPM and the fourth harmonic at 333.3 Hz under 2,500 RPM, lie near the bearing fault characteristic frequencies of 306.6 Hz and 344.1 Hz, respectively, and are not completely eliminated, which affects the amplitude of the bumps at these positions to some degree. In contrast, the feature extraction method proposed in this section produces more distinct spikes at the same bearing fault characteristic frequencies, and the spectrum after linear predictive filtering shows better noise reduction than the logFFT, power spectrum and comb-filtered spectra: the overall noise interference is lower, the fundamental frequency and its harmonics are basically eliminated, and the bearing fault characteristic frequencies have higher amplitudes and more prominent peaks.
On the other hand, the local spectral kurtosis values in Tables 1 and 2 show quantitatively that the feature extraction method proposed in this paper yields the highest local spectral kurtosis at most characteristic frequencies (seven of the nine in each table) compared with logFFT, power spectrum and comb filtering, which is consistent with the spectrum analysis.

In order to monitor and evaluate the bearing fault status, the local spectral kurtosis at each fault characteristic frequency of moderately faulty bearings and normal bearings under no-load, half-load and full-load conditions at 1,500 RPM is selected as typical cases for analysis, as shown in Fig. 9. The abscissa in Fig. 9 is the ID corresponding to the fault characteristic frequency in Table 3.

Fig. 9.

Local spectral kurtosis of fault feature frequencies under different conditions at 1,500 RPM: (a) no-load condition; (b) half-load condition; (c) full-load condition.


Table 3.


Fault characteristic frequency of bearing inner ring faults and their ID.

ID	Fault characteristic frequency	ID	Fault characteristic frequency	ID	Fault characteristic frequency	ID	Fault characteristic frequency
1	|f_s − f_r|	13	|f_s + f_i − f_r|	25	|f_s − 2f_i|	37	|2f_s − 3f_i|
2	|f_s + f_r|	14	|f_s + f_i|	26	|f_s − 2f_i − f_r|	38	|2f_s + 2f_i + 3f_r|
3	|2f_s − f_i + 3f_r|	15	|f_s + f_i + f_r|	27	|f_s + 2f_i − 2f_r|	39	|f_s − 3f_i + f_r|
4	|f_s + 2f_r|	16	|2f_s − 2f_i + 3f_r|	28	|f_s + 2f_i − f_r|	40	|f_s − 3f_i|
5	|2f_s − f_i + 2f_r|	17	|f_s + f_i + 2f_r|	29	|f_s + 2f_i|	41	|f_s + 3f_i − 3f_r|
6	|f_s + 3f_r|	18	|2f_s − 2f_i + 2f_r|	30	|f_s + 2f_i + f_r|	42	|f_s + 3f_i − 2f_r|
7	|2f_s − f_i + f_r|	19	|f_s + f_i + 3f_r|	31	|2f_s − 3f_i + 3f_r|	43	|f_s + 3f_i − f_r|
8	|2f_s − f_i|	20	|f_s − 2f_i + 3f_r|	32	|2f_s + f_i|	44	|f_s + 3f_i|
9	|f_s − f_i + f_r|	21	|2f_s + f_i + 2f_r|	33	|2f_s − 3f_i + 2f_r|	45	|f_s + 3f_i + f_r|
10	|f_s − f_i|	22	|2f_s − 2f_i|	34	|f_s + 2f_i + 3f_r|	46	|2f_s + 3f_i|
11	|f_s − f_i − f_r|	23	|2f_s + f_i + 3f_r|	35	|f_s − 3f_i + 3f_r|	47	|f_s + 3f_i + 3f_r|
12	|f_s + f_i − 2f_r|	24	|f_s − 2f_i + f_r|	36	|2f_s + 2f_i + 2f_r|	48	|f_s + 3f_i + 3f_r|

The blue curve represents the local spectral kurtosis of the normal bearing, while the red curve represents the local spectral kurtosis of the moderately faulty bearing. It can be observed that at most of the fault characteristic frequencies, the values of local spectral kurtosis for the faulty bearing are significantly larger than those for the normal bearing, mainly concentrated at the feature frequencies of the fundamental frequency modulation. However, at a considerable number of feature frequencies, the local spectral kurtosis of the faulty bearing is similar to or even smaller than that of the normal bearing. In addition, the local spectral kurtosis values at the same characteristic frequency are also different under different load conditions of no load, half load and full load. The locations and magnitudes of peaks in the local kurtosis spectrum of the faulty bearing do not exhibit a clear pattern, sometimes appearing and sometimes not, and varying in magnitude, making it difficult to establish a unified state evaluation threshold solely based on a single or specified local spectral kurtosis indicator.

In order to compare and analyse the application effect of the adaptive optimal weight method proposed in this paper with the application effects of traditional machine learning, this paper first uses shallow neural network as a recognition classifier to identify and classify the data samples of normal bearings and faulty bearings. The network is a two-layer feed-forward network consisting of a hidden layer and an output layer. The hidden layer uses the sigmoid activation function, and the output layer uses the SoftMax activation function. In addition, the hidden layer has 64 neurons, while the number of output neurons is four.
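For reference, the forward pass of such a network can be sketched as follows. Only the layer structure (64 sigmoid hidden neurons, four SoftMax outputs) comes from the description above; the input dimension of 48 (one local spectral kurtosis value per fault characteristic frequency), the random initialization and the input values are assumptions, and training is omitted.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    m = max(zs)                      # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, biases):
    # One fully connected layer: weights has one row per output neuron.
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def init_params(n_in, n_hidden=64, n_out=4, seed=0):
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"w1": mat(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "w2": mat(n_out, n_hidden), "b2": [0.0] * n_out}

def forward(x, params):
    hidden = [sigmoid(z) for z in dense(x, params["w1"], params["b1"])]
    return softmax(dense(hidden, params["w2"], params["b2"]))

params = init_params(n_in=48)        # assumed input size: 48 LSK features
probs = forward([3.0] * 48, params)  # hypothetical feature vector
print(probs)                         # class probabilities for the four bearing states
```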

Training uses data samples from no-load and half-load conditions at speeds of 900 RPM, 1,200 RPM, 1,500 RPM, 1,700 RPM, 1,800 RPM, 2,000 RPM, 2,100 RPM, 2,300 RPM and 2,400 RPM. The trained model is then evaluated on the test data set, and its generalization ability is verified on full-load conditions. There are eight sets of measurement data for each operating condition, each containing 10 seconds of 2-phase current data. To ensure that the training and test data sets are large enough, each 10-second measurement is split into two 5-second segments, and the two stator phase currents are treated as separate data samples. This gives 576 data samples (9 × 2 × 8 × 2 × 2, where the factors from left to right are the number of speed conditions, load conditions, measurement data sets, segments and current phases) for each bearing (normal bearing, slightly faulty bearing, moderately faulty bearing and severely faulty bearing). The speed and load of these data samples are not distinguished. For each bearing, 80% (461 groups) of the data samples are randomly selected as the training data set for the shallow neural network model, and the remaining 20% (115 groups) are used as the test data set to evaluate the model. Fig. 10 shows the accuracy curves of the training data and test data over the training epochs.
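The sample counts above can be checked with a few lines of arithmetic. The random split below is only a sketch of an 80/20 partition; the seed and the use of index shuffling are assumptions.

```python
import random

n_speeds, n_loads, n_measurements, n_segments, n_phases = 9, 2, 8, 2, 2
samples_per_bearing = n_speeds * n_loads * n_measurements * n_segments * n_phases

n_train = round(0.8 * samples_per_bearing)   # 80% of the samples, rounded to a whole sample
n_test = samples_per_bearing - n_train

# Random 80/20 partition of the sample indices for one bearing.
indices = list(range(samples_per_bearing))
random.Random(0).shuffle(indices)
train_idx, test_idx = indices[:n_train], indices[n_train:]

print(samples_per_bearing, n_train, n_test)
```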

Fig. 10.

The accuracy of the training data and test data of the model in different epochs.


After repeated experiments, the average training accuracy of the model is 98.12%, while the average test accuracy can reach 96.57%. The classification and recognition results of two typical test datasets, Experiments 1 and 2, are shown in Fig. 11. The comprehensive classification accuracy of Experiment 1 is 96.96%, and the comprehensive classification accuracy of Experiment 2 is 96.30%.

Fig. 11.

Classification and recognition results of two typical test data sets: (a) Experiment 1; (b) Experiment 2.


In order to verify the applicability of the model to other operating conditions, full-load data under the operating conditions of 900 RPM, 1,200 RPM, 1,500 RPM, 1,700 RPM, 1,800 RPM, 2,000 RPM, 2,100 RPM, 2,300 RPM and 2,400 RPM were also used for model testing, and the processing method of full-load data is consistent with previous experiments. For each faulty bearing, 115 sets of data samples were also selected to test the performance of the network model of the above experiment. The comprehensive accuracy rate of full-load experimental data in Experiment 1 is only 57.6%, while that in Experiment 2 is 61.9%. The accuracy rate of less than 70% cannot meet the accuracy requirements of bearing fault classification and identification. From this point of view, the classification and recognition of machine learning has high requirements on training samples. It performs well in training and testing for specific operating conditions but exhibits poor generalization ability for other operating conditions and lacks interpretability of mechanisms.

Finally, according to the adaptive optimization weight solution method in this paper, normal bearing and moderately faulty bearing data set samples were selected as input. A total of 27 sets of data were obtained at 900 RPM, 1,200 RPM, 1,500 RPM, 1,700 RPM, 1,800 RPM, 2,000 RPM, 2,100 RPM, 2,300 RPM and 2,400 RPM, as well as torque under no-load, half-load and full-load conditions. Each set consisted of eight samples, totaling 144 datasets. The normal bearing experimental data were used as normal samples, while the experimental data of moderate inner ring fault bearings under the same operating conditions were used as fault samples.

This yields the distribution of the 48 optimized weights, indexed by fault characteristic frequency ID, as shown in Fig. 12. According to the theory of adaptive weights, larger positive weights indicate that the fault characteristics are more pronounced than the normal characteristics; these are effective features that need to be retained. Weights near 0 indicate little difference between fault and normal characteristics. Negative weights indicate that the fault characteristics are smaller than the normal characteristics, suggesting interference at the corresponding fault characteristic frequency; these features should be excluded.

Fig. 12.

Weights of local spectral kurtosis at various fault feature frequencies in descending order.


The solved weight values are arranged in descending order. From Fig. 12, it is evident that the larger weights correspond to fault characteristic numbers 1, 2, 14 and 11, representing fault characteristic frequencies |f_s − f_r|, |f_s + f_r|, |f_s + f_i| and |f_s − f_i|. For the bearing inner ring fault, this mainly reflects that the fault-induced motor air gap eccentricity increases the amplitude at the rotation frequency, and that the torque fluctuation increases at the fault characteristic frequency under modulation by the current fundamental frequency. The 10 negative weights are eliminated, and from the remaining positive weights those making up the top 90% of the weight proportion are selected to calculate the bearing health factor; this selection is marked by the red dotted box in Fig. 12. The 25 selected weights are then redistributed in proportion.

Based on the optimized weight vector, the optimal weight health factors of normal bearings, slightly faulty bearings, moderately faulty bearings and severely faulty bearings were calculated, and the health factors under no-load, half-load and full-load conditions were obtained, as shown in Table 4. It can be seen that the optimized weights calculated based on the data of moderately faulty bearings still have strong adaptability to other types of faulty bearings, showing a clear monotonic trend with increasing severity of bearing faults. The bearing failure coefficient can be expressed by the ratio between the change of air gap caused by bearing failure and the length of the air gap. According to the bearing fault size, the failure coefficients of normal bearings, slightly faulty bearings, moderately faulty bearings and seriously faulty bearings are preliminarily defined as 0, 0.2, 0.3 and 0.5, respectively. The health factors in Table 4 are approximated by quadratic functions y = ax²+b, with fittings performed for different load conditions to demonstrate the increasing trend of health factors with increasing fault coefficients. Fig. 13 illustrates the health factors and their fitting curves for bearings with different fault coefficients under no-load, half-load and full-load conditions. The health factor increases monotonically with the increase of bearing fault severity, which further demonstrates the effectiveness of the proposed method.
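The quadratic approximation y = ax² + b is linear in x², so it can be fitted by ordinary least squares with x² as the regressor. The sketch below applies this to the health factors of Table 4; the choice of ordinary least squares is an assumption, since the fitting procedure is not specified.

```python
def fit_ax2_plus_b(xs, ys):
    """Least-squares fit of y = a * x**2 + b, treating u = x**2 as the regressor."""
    us = [x * x for x in xs]
    n = len(us)
    u_mean = sum(us) / n
    y_mean = sum(ys) / n
    s_uu = sum((u - u_mean) ** 2 for u in us)
    s_uy = sum((u - u_mean) * (y - y_mean) for u, y in zip(us, ys))
    a = s_uy / s_uu
    b = y_mean - a * u_mean
    return a, b

# Fault coefficients and health factors taken from Table 4.
fault_coefficients = [0.0, 0.2, 0.3, 0.5]
health_factors = {
    "no load": [3.344, 3.793, 4.379, 6.212],
    "half load": [3.491, 3.812, 4.491, 6.316],
    "full load": [3.303, 3.901, 4.541, 6.391],
}

fits = {load: fit_ax2_plus_b(fault_coefficients, ys) for load, ys in health_factors.items()}
for load, (a, b) in fits.items():
    print(f"{load}: a = {a:.3f}, b = {b:.3f}")
```

A positive coefficient a for every load condition corresponds to the increasing trend of the health factor with the fault coefficient.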

Fig. 13.

Fitted curves of bearing health factors for different severity levels of faults.

Table 4.

Health factors of bearings under different fault severity levels and loads.

Fault severity levels	Fault coefficient	No load	Half load	Full load
Normal bearing	0	3.344	3.491	3.303
Lightly faulty bearing	0.2	3.793	3.812	3.901
Moderately faulty bearing	0.3	4.379	4.491	4.541
Severely faulty bearing	0.5	6.212	6.316	6.391
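As an illustration of the quadratic approximation y = ax² + b applied to Table 4, the following Python sketch fits the two coefficients by ordinary least squares for each load condition. The paper does not state which fitting procedure was used, so least squares is an assumption, and the names `fit_quadratic`, `fault_coeff` and `health_factors` are hypothetical.

```python
import numpy as np

# Fault coefficients and health factors copied from Table 4.
fault_coeff = np.array([0.0, 0.2, 0.3, 0.5])
health_factors = {
    "no_load":   np.array([3.344, 3.793, 4.379, 6.212]),
    "half_load": np.array([3.491, 3.812, 4.491, 6.316]),
    "full_load": np.array([3.303, 3.901, 4.541, 6.391]),
}

def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b (no linear term).

    Assumption: ordinary least squares; the paper does not name the method.
    """
    # The model is linear in (a, b), so a design matrix [x^2, 1] suffices.
    design = np.column_stack([x ** 2, np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(a), float(b)

fits = {load: fit_quadratic(fault_coeff, y) for load, y in health_factors.items()}

# Check the monotonic trend reported in the text for every load condition.
monotonic = {load: bool(np.all(np.diff(y) > 0)) for load, y in health_factors.items()}

for load, (a, b) in fits.items():
    print(f"{load}: a = {a:.3f}, b = {b:.3f}, monotonic = {monotonic[load]}")
```

Because the model omits the linear term, the intercept b approximates the health factor of the normal bearing, while a positive coefficient a reflects the growth of the health factor with the fault coefficient.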

The bearing fault status evaluation experiments in this section show that expert rule-based indicator analysis alone cannot readily establish a unified state assessment criterion from a single or pre-specified local spectral kurtosis. Machine learning-based methods, in turn, place high demands on training samples, generalize poorly and lack mechanism interpretability. The health factor based on adaptive optimal weights proposed in this paper further enhances the bearing fault characteristics and combines the strong physical interpretability of expert rules with the adaptive capability of machine learning methods.

5. Conclusions

This paper proposes a bearing fault diagnosis and state evaluation method based on linear prediction filtering and an optimized weight local spectral kurtosis health factor. The method quantitatively assesses bearing health, enabling bearing diagnosis and graded early warning, and offers strong interpretability, good assessment performance and strong adaptive capability. Linear prediction is used to filter out noise interference in the motor current signal, the modified average periodogram method is used to improve spectral estimation accuracy, and local spectral kurtosis is proposed as the fault characteristic indicator. Compared to other feature extraction methods, the combination of linear prediction adaptive filtering and local spectral kurtosis feature extraction extracts weak bearing fault information more effectively. The bearing fault status is then assessed with a health factor based on adaptive optimized weight local spectral kurtosis. This health factor further enhances the bearing fault characteristics, combining the simplicity and clear physical significance of expert rule methods with the strong adaptive capability of machine learning-based intelligent algorithms. The combination improves the robustness of fault assessment and its immunity to complex operating conditions and application scenarios.

Future research should expand the simulation experiments to different types of bearing faults, quantify the severity of different faults for hierarchical diagnosis and warning, simulate the entire life cycle of bearings and accurately predict bearing life. In addition, the computational complexity of the bearing fault electrical signal diagnosis method needs to be further reduced.

Acknowledgements

This research was mainly supported by the Hunan Science and Technology Innovation Plan project (Grant No. 2024JK2046).

Conflict of interest statement

The authors declare that there are no potential conflicts of interest in the publication of this research output.
