Probing three-dimensional magnetic fields: II – an interpretable Convolutional Neural Network | Monthly Notices of the Royal Astronomical Society

ABSTRACT

Observing 3D magnetic fields, including orientation and strength, within the interstellar medium is vital but notoriously difficult. However, recent advances in our understanding of anisotropic magnetohydrodynamic (MHD) turbulence demonstrate that MHD turbulence and 3D magnetic fields leave their imprints on the intensity features of spectroscopic observations. Leveraging these theoretical frameworks, we propose a novel Convolutional Neural Network (CNN) model to extract this embedded information, enabling the probe of 3D magnetic fields. This model examines the plane-of-the-sky magnetic field orientation (ϕ), the magnetic field’s inclination angle (γ) relative to the line-of-sight, and the total magnetization level (|$M_{\rm A}^{-1}$|) of the cloud. We train the model using synthetic emission lines of ¹³CO (J = 1–0) and C¹⁸O (J = 1–0), generated from 3D MHD simulations that span conditions from sub-Alfvénic to super-Alfvénic molecular clouds. Our tests confirm that the CNN model effectively reconstructs the 3D magnetic field topology and magnetization. The median uncertainties are under 5° for both ϕ and γ, and less than 0.2 for M_A in sub-Alfvénic conditions (M_A ≈ 0.5). In super-Alfvénic scenarios (M_A ≈ 2.0), they are under 15° for ϕ and γ, and 1.5 for M_A. We applied this trained CNN model to the L1478 molecular cloud. Results show a strong agreement between the CNN-predicted magnetic field orientation and that derived from Planck 353 GHz polarization. The CNN approach enabled us to construct the 3D magnetic field map for L1478, revealing a global inclination angle of ≈76° and a global M_A of ≈1.07.

turbulence, ISM: general, ISM: magnetic field, ISM: structure

1 INTRODUCTION

In the vast interstellar medium (ISM), magnetic fields are pervasive and significantly influence various astrophysical phenomena. These fields help balance gravitational forces within the ISM, maintaining its equilibrium (Wurster & Li 2018; Abbate et al. 2020). They are instrumental in directing gas flows towards galactic nuclei, playing a crucial role in their sustenance and the dynamic processes unfolding therein (Kim & Stone 2012; Roche et al. 2018; Busquet 2020; Whittingham et al. 2021; Hu et al. 2022c). Magnetic fields also govern the trajectories of cosmic rays, affecting the energy distribution and overall dynamics of the ISM (Fermi 1949; Jokipii 1966; Yan & Lazarian 2002, 2004; Ferrand & Marcowith 2010; Xu & Yan 2013; Xu & Lazarian 2020; Hopkins et al. 2021; Hu, Lazarian & Xu 2022b). Furthermore, they are deeply involved in the star formation processes within molecular clouds, influencing both the rate and nature of star formation (Mestel 1965; Mac Low & Klessen 2004; McKee & Ostriker 2007; Federrath & Klessen 2012; Lazarian, Esquivel & Crutcher 2012; Hu, Lazarian & Stanimirović 2021b). Despite their pivotal roles, our understanding of these magnetic fields remains far from complete.

Our primary challenge lies in the formidable task of probing a three-dimensional (3D) magnetic field in 3D space. Current approaches, such as polarized dust emission (Lazarian 2007; Andersson, Lazarian & Vaillancourt 2015; Planck Collaboration et al. 2015; Fissel et al. 2016; Planck Collaboration et al. 2020b; Li et al. 2021; Liu, Hu & Lazarian 2023a) and polarized synchrotron emission (Xiao et al. 2008; Planck Collaboration et al. 2016; Guan et al. 2021), provide 2D measurements of the plane-of-sky (POS) magnetic field direction, while Zeeman splitting (Crutcher 2004, 2012) and Faraday rotation (Haverkorn 2007; Taylor, Stil & Sunstrum 2009; Oppermann et al. 2012; Xu & Zhang 2016; Tahani et al. 2019) provide line-of-sight (LOS) components of the magnetic field. While yielding valuable insights, these techniques probe distinct and typically different regions of the multiphase ISM. Thus, despite their individual strengths, merging these insights into a coherent, full 3D magnetic field vector, which includes both the 3D orientation and total strength, is a non-trivial task.

A significant advance in probing the 3D magnetic fields in molecular clouds has been made by leveraging polarized dust emission, drawing on the depolarization effect induced by different magnetic field orientations (see Chen et al. 2019) and by accounting for the properties of turbulent magnetic fields (Hu & Lazarian 2023a, c). As a separate development, Tahani et al. (2019, 2022) have succeeded in employing the synergy of Faraday rotation and dust polarization to infer a helical 3D magnetic field topology across the Orion A, Orion B, Perseus, and California clouds. Subsequently, Hu, Xu & Lazarian (2021a) and Hu, Lazarian & Xu (2021c) proposed the use of anisotropic properties of magnetohydrodynamic (MHD) turbulence, inherited by young stellar objects (Ha et al. 2022) and spectroscopic lines (Lazarian & Pogosyan 2000; Kandel, Lazarian & Pogosyan 2016; Hu et al. 2023), to obtain the LOS and POS components of the magnetic field’s orientation and total magnetization simultaneously.

Importantly, the underlying theory of Hu, Lazarian & Xu (2021c)’s approach demonstrates that spectroscopic observations embody the anisotropy of MHD turbulence (Lazarian & Pogosyan 2000; Kandel, Lazarian & Pogosyan 2016; Hu et al. 2023), i.e. turbulent eddies elongate along the 3D direction of the magnetic field (Goldreich & Sridhar 1995; Lazarian & Vishniac 1999). The spatial features presented in these observations imprint the anisotropy and thus carry detailed information about the magnetic fields. This implies that, given an extensive amount of training data, machine learning algorithms have the potential to capture these features and produce accurate measurements. This strategy has been employed to map the 2D POS magnetic field orientation using velocity channel maps from spectroscopic observations (Xu, Law & Tan 2023). The theoretical basis remains the anisotropy of the MHD turbulence, a principle previously utilized to trace magnetic fields via velocity gradients (Hu, Yuen & Lazarian 2018; Lazarian & Yuen 2018; Alina et al. 2022; Liu, Hu & Lazarian 2022a; Hu et al. 2022c; Schmaltz, Hu & Lazarian 2023). However, Hu, Lazarian & Xu (2021c) made the crucial discovery that anisotropy in velocity channel maps harbours not only information about the POS magnetic field orientation, but also the total magnetization and the magnetic field’s inclination angle with respect to the LOS. This additional information paves the way for constructing the full 3D magnetic field vector from spectroscopic observations.

By leveraging the capabilities of Convolutional Neural Networks (CNN; LeCun et al. 1998) – a type of deep learning model excelling in image and signal processing – we aspire to develop a novel method that can probe the 3D magnetic field. Earlier studies of the CNN explored the possibility to distinguish sub-Alfvénic and super-Alfvénic turbulence (Peek & Burkhart 2019) and predict the POS magnetic field orientation (Xu, Law & Tan 2023). Our study, however, targets the simultaneous extraction of LOS and POS magnetic field orientations and the total magnetization. The foundation of our CNN model is the anisotropic MHD turbulence exhibited in spectroscopic observations (Lazarian & Pogosyan 2000; Kandel, Lazarian & Pogosyan 2016; Hu et al. 2023), a theoretical underpinning that allows us to interpret the CNN model accurately. In other words, it enables us to discern the specific features that convey information about the magnetic field, the reasons why they are informative, and their underlying physical meanings. The effectiveness of training a CNN is highly dependent on the availability of comprehensive numerical simulations that accurately represent realistic ISM. In this research, we employ 3D MHD supersonic simulations that portray a range of ISM environments, spanning sub-Alfvénic scenarios (i.e. strong magnetic field), trans-Alfvénic, and super-Alfvénic conditions (i.e. weak magnetic field). We further post-process these simulations by incorporating the radiative transfer effect, which enables us to generate mock emission lines of ¹³CO and C¹⁸O from diffuse molecular clouds. Through this trained CNN model, we present the 3D magnetic field map of the molecular cloud L1478.

This paper is organized as follows. In Section 2, we briefly review the basic concepts of MHD turbulence anisotropy in spectroscopic observations and their correlation with 3D magnetic field orientation and total magnetization. In Section 3, we give details of the 3D MHD simulations and mock observations used in this work, as well as our CNN model. We use mock observations to train the CNN model and present the results of numerical testing in Section 4. We further apply the trained CNN model to predict the 3D magnetic field in the molecular cloud L1478. In Section 5, we discuss the uncertainty and prospects of the machine learning approach, as well as implications for various astrophysical problems. We summarize our results in Section 6.

2 THEORETICAL CONSIDERATION

2.1 Anisotropy of MHD turbulence: revealing magnetic field orientation and magnetization

The earliest model of MHD turbulence was proposed to be isotropic (Iroshnikov 1963; Kraichnan 1965). However, this model underwent subsequent revisions through a series of theoretical and numerical studies, revealing that MHD turbulence exhibits anisotropy under sub-Alfvénic conditions and isotropy at large-scale, super-Alfvénic conditions (Montgomery & Turner 1981; Shebalin, Matthaeus & Montgomery 1983; Higdon 1984; Montgomery & Matthaeus 1995).

A significant advance in this field was the introduction of the ‘critical balance’ condition, i.e. equating the cascading time (k_⊥v_l)⁻¹ and the wave periods (k_∥v_A)⁻¹, proposed by Goldreich & Sridhar (1995), hereafter GS95. Here, k_∥ and k_⊥ represent the components of the wavevector parallel and perpendicular to the magnetic field, respectively, while v_l denotes the turbulent velocity at scale l, and |$v_{\rm A} = B/\sqrt{4\pi\rho}$| represents the Alfvén speed. Here, B is the magnetic field strength and ρ is the gas mass density.

Taking into account Kolmogorov-type turbulence, i.e. v_l ∝ l^{1/3}, the GS95 anisotropy scaling can be straightforwardly derived:

$$\begin{eqnarray} k_\parallel \propto k_\bot ^{2/3}, \end{eqnarray}$$

(1)

which reveals the anisotropic nature of turbulence eddies, implying that the eddies are elongated along the magnetic fields. However, it should be noted that the considerations of GS95 are based on a global reference frame, where the direction of the wavevectors is defined relative to the mean magnetic field.
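The step from critical balance to equation (1) can be sketched as follows. This is a minimal illustration that assumes the eddy scale can be identified with the perpendicular wavenumber, l ∼ k_⊥⁻¹, and that v_A is constant across scales; both assumptions are introduced here and are not spelled out in the text above.

$$\begin{eqnarray} k_\parallel v_{\rm A} \approx k_\bot v_l, \quad v_l \propto l^{1/3} \propto k_\bot ^{-1/3} \quad \Rightarrow \quad k_\parallel \propto \frac{k_\bot v_l}{v_{\rm A}} \propto k_\bot ^{2/3}. \end{eqnarray}$$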

Scale-dependent anisotropy was later introduced via the study of fast turbulent reconnection by Lazarian & Vishniac (1999, hereafter LV99), which proposed a local reference frame. This frame is defined relative to the magnetic field passing through an eddy at scale l. According to LV99, the motion of eddies perpendicular to the direction of the local magnetic field adheres to the Kolmogorov law (i.e. |$v_{l,\bot }\propto l_\bot ^{1/3}$|), since this is the direction in which the magnetic field offers minimal resistance. Applying the ‘critical balance’ condition in the local reference frame, |$v_{l,\bot}l_\bot ^{-1}\approx v_{\rm A} l_\parallel ^{-1}$|, the scale-dependent anisotropy scaling is then given by:

$$\begin{eqnarray} l_\parallel = L_{\rm inj} \bigg(\frac{l_\bot }{L_{\rm inj}}\bigg)^{\frac{2}{3}} M_{\rm A}^{-4/3},{\quad}M_{\rm A}\le 1, \end{eqnarray}$$

(2)

where l_⊥ and l_∥ represent the perpendicular and parallel scales of eddies with respect to the local magnetic field, respectively. L_inj denotes the turbulence injection scale and M_A = v_inj/v_A is the Alfvén Mach number. |$M_{\rm A}^{-1}$| gives the magnetization level of the medium.
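The algebra behind equation (2) can be sketched as follows. The sketch assumes the LV99 normalization of the perpendicular velocity in sub-Alfvénic turbulence, |$v_{l,\bot } \approx v_{\rm inj}(l_\bot /L_{\rm inj})^{1/3}M_{\rm A}^{1/3}$|; this prefactor is not stated in the text above and is quoted here as an assumption. Inserting it into the local critical balance condition and using v_A = v_inj/M_A gives:

$$\begin{eqnarray} l_\parallel \approx \frac{v_{\rm A}\, l_\bot }{v_{l,\bot }} = \frac{l_\bot }{M_{\rm A}^{4/3}} \bigg(\frac{l_\bot }{L_{\rm inj}}\bigg)^{-\frac{1}{3}} = L_{\rm inj} \bigg(\frac{l_\bot }{L_{\rm inj}}\bigg)^{\frac{2}{3}} M_{\rm A}^{-4/3}. \end{eqnarray}$$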

Equation (2) provides two critical insights: (1) turbulent eddies stretch along the local magnetic field (i.e. l_∥ ≫ l_⊥), and (2) the degree of anisotropy, defined as l_∥/l_⊥, depends on the magnetization |$M_{\rm A}^{-1}$|. As illustrated in Fig. 1, this indicates that eddies become increasingly anisotropic in a strongly magnetized medium. For the case where M_A ≫ 1, turbulence is essentially isotropic on large scales due to the predominance of hydrodynamic turbulence. However, the essence of turbulence lies in the cascading of energy from larger injection scales to smaller ones, which leads to a decrease in turbulent velocity. Eventually, at the transition scale |$l_a = L_{\rm inj}/M_{\rm A}^3$|, the strength of the magnetic field becomes comparable to the turbulence (i.e. the Alfvén Mach number at l_a is unity, see Lazarian 2006), and anisotropy starts to manifest.
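To make the scalings concrete, the following minimal sketch evaluates equation (2) and the transition scale numerically. The function names and the example values (a 10 pc injection scale, a 0.1 pc eddy) are illustrative assumptions, not quantities taken from the paper's analysis.

```python
def parallel_scale(l_perp, l_inj, mach_a):
    """Equation (2): l_par = L_inj * (l_perp / L_inj)**(2/3) * M_A**(-4/3), for M_A <= 1."""
    if mach_a > 1.0:
        raise ValueError("equation (2) applies only to M_A <= 1")
    return l_inj * (l_perp / l_inj) ** (2.0 / 3.0) * mach_a ** (-4.0 / 3.0)


def anisotropy(l_perp, l_inj, mach_a):
    """Degree of anisotropy l_par / l_perp of an eddy of perpendicular size l_perp."""
    return parallel_scale(l_perp, l_inj, mach_a) / l_perp


def transition_scale(l_inj, mach_a):
    """Transition scale l_a = L_inj / M_A**3 for super-Alfvenic turbulence."""
    return l_inj / mach_a ** 3


# Illustrative values only: L_inj = 10 pc, l_perp = 0.1 pc.
strong = anisotropy(0.1, 10.0, 0.5)   # strongly magnetized medium
weak = anisotropy(0.1, 10.0, 1.0)     # trans-Alfvenic medium
print("l_par/l_perp at M_A = 0.5:", strong)
print("l_par/l_perp at M_A = 1.0:", weak)
print("l_a for M_A = 2 and L_inj = 10 pc:", transition_scale(10.0, 2.0), "pc")
```

At fixed l_⊥, halving M_A increases the anisotropy by a factor of 2^{4/3}, which is the sense in which stronger magnetization produces more elongated eddies.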

Figure 1.

Illustration of how the observed intensity structures in a channel map are regulated by M_A and γ. Within all three panels, these intensity structures elongate along the POS magnetic field direction where l_∥ > l_⊥. Structures 1 and 2, depicted in panels (a) and (b), are projected onto the POS with identical inclination angles γ₁ = γ₂, yet exhibit different magnetizations with |$M_{\rm A,1}^{-1} > M_{\rm A,2}^{-1}$|. Notably, the anisotropy observed, represented as l_∥/l_⊥, in the weakly magnetized Structure 2 is less pronounced than in Structure 1. Structure 2 is less straight because the weak magnetic field has more fluctuations. The curvature of the observed magnetic structures is suggested for magnetization studies by Yuen & Lazarian (2020). Comparatively, Structures 1 and 3, showcased in panels (a) and (c), possess equivalent magnetizations |$M_{\rm A,1}^{-1}= M_{\rm A,3}^{-1}$|, but different inclination angles with γ₁ > γ₃. The observed anisotropy decreases with smaller γ, though it is crucial to note that the straightness of Structure 3 remains unaffected by this projection. It should be noted that, here, the projection effect is simplified. The intensity structures are predominantly created by the velocity caustics effect due to MHD turbulence. The projection effect is applied to the velocity field and then to the subsequent intensity structures in velocity channels.

Furthermore, (3) changes in M_A are distinctly reflected in the magnetic field topology. Within a strongly magnetized medium, the magnetic field lines exhibit minimal variation due to the presence of weaker fluctuations, resulting in more straightened field lines. In contrast, in the context of a weaker magnetic field, which corresponds to a larger value of M_A, fluctuations in the magnetic field direction intensify significantly. This leads to the field lines adopting a more curved configuration (Yuen & Lazarian 2020). As turbulent eddies extend along the local magnetic field, the topological changes induced by M_A become evidently imprinted within these eddies.

2.2 Obtaining velocity information from spectroscopic observation

The anisotropy outlined in equation (2) pertains to turbulent velocity fluctuations, and a turbulent eddy refers to a contour of the velocity fluctuation. This suggests that anisotropy manifests in turbulent velocity fields. Such anisotropic velocity information can be obtained from the velocity channel maps of spectroscopic observations, owing to the velocity caustics effect (Lazarian & Pogosyan 2000). We briefly review this concept.

In position–position–velocity (PPV) space, the observed intensity distribution of a given spectral line is determined by both the density of emitters and their velocity distribution along the LOS. If coherent velocity shear – for instance, from galactic rotation – can be disregarded,¹ the LOS velocity component, v, becomes the sum of the turbulent velocity, v_tur(x, y, z), and the residual component attributable to thermal motions. This residual thermal velocity, v − v_tur(x, y, z), has a Maxwellian distribution, ϕ(v, x, y, z). For emissivity proportional to density, it provides the PPV emission density ρ_s(x, y, v) as (Lazarian & Pogosyan 2004):

$$\begin{eqnarray} \rho_{\rm s}(x,y,v)=\kappa \int \rho (x,y,z)\,\phi (v,x,y,z)\,{\rm d}z, \end{eqnarray}$$

(3)

$$\begin{eqnarray} \phi (v,x,y,z) \equiv \frac{1}{\sqrt{2\pi c_{\rm s}^2}}\exp \bigg[-\frac{[v-v_{\rm tur}(x,y,z)]^2}{2c_{\rm s}^2}\bigg], \end{eqnarray}$$

(4)

where κ is a constant that correlates the number of emitters to the observed intensities. |$c_{\rm s} = \sqrt{\gamma_{\rm a} k_{\rm B}T/m}$| is the sound speed, with m being the mass of atoms or molecules, γ_a the adiabatic index, k_B being the Boltzmann constant, and T the temperature, which can vary from point to point if the emitter is not isothermal. However, the variation of temperature has only a marginal contribution to the distribution of ρ_s(x, y, v) (see Hu et al. 2023). By integrating ρ_s(x, y, v) over a defined velocity range or channel width Δv, we obtain a velocity channel:

$$\begin{eqnarray} p(x,y,v)=\int _{v-\Delta v/2}^{v+\Delta v/2}\rho_{\rm s}(x,y,v^\prime)\,{\rm d}v^\prime . \end{eqnarray}$$

(5)

By separating the 3D density into the mean density and zero-mean fluctuations, |$\rho (x,y,z) = \bar{\rho }+ \bar{\rho }\delta (x,y,z)$|, the channel intensity can be represented as the sum of two terms, p(x, y, v) = p_vc(x, y, v) + p_dc(x, y, v) (Hu et al. 2023):

$$\begin{eqnarray} p_{\rm vc} \equiv \int _{v-\Delta v/2}^{v+\Delta v/2}\,{\rm d}v^\prime \int \bar{\rho} \phi (v^\prime ,x,y,z)\,{\rm d}z, \end{eqnarray}$$

(6)

$$\begin{eqnarray} p_{\rm dc} \equiv\int_{v-\Delta v/2}^{v+\Delta v/2}{\rm d}v^\prime \int \bar{\rho }\delta (x,y,z)\,\phi(v^\prime ,x,y,z)\,{\rm d}z. \end{eqnarray}$$

(7)

The first term, p_vc, encompasses the mean intensity in the channel and carries fluctuations exclusively produced by velocity, called the velocity caustics effect (Lazarian & Pogosyan 2000). The second term, p_dc, reflects the inhomogeneities in the real 3D density.
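Equations (3)–(7) can be illustrated with a small numerical sketch. The code below builds a toy density and LOS velocity cube from random numbers (purely synthetic, not the paper's simulations), evaluates a thin channel by a midpoint-rule integral over the channel width, and checks that the channel splits into the velocity and density contributions. The function names, the cube size, κ = 1, and the integration resolution `n_sub` are assumptions made for this illustration.

```python
import math
import random


def maxwellian(v, v_tur, c_s):
    """Equation (4): thermal line profile centred on the turbulent velocity."""
    return math.exp(-(v - v_tur) ** 2 / (2.0 * c_s ** 2)) / math.sqrt(2.0 * math.pi * c_s ** 2)


def channel_map(density, velocity, v0, dv, c_s, kappa=1.0, n_sub=32):
    """Equations (3) and (5): channel p(x, y, v0) for cubes indexed as [x][y][z].

    The LOS integral is a sum over z with unit cell size; the integral over the
    channel width dv uses a midpoint rule with n_sub sub-intervals.
    """
    sub = dv / n_sub
    centres = [v0 - dv / 2.0 + (k + 0.5) * sub for k in range(n_sub)]
    out = []
    for dens_plane, vel_plane in zip(density, velocity):
        row = []
        for dens_los, vel_los in zip(dens_plane, vel_plane):
            total = 0.0
            for rho, v_tur in zip(dens_los, vel_los):
                total += rho * sum(maxwellian(v, v_tur, c_s) for v in centres) * sub
            row.append(kappa * total)
        out.append(row)
    return out


random.seed(0)
n = 4  # toy cube of n**3 cells
density = [[[1.0 + 0.5 * random.random() for _ in range(n)] for _ in range(n)] for _ in range(n)]
velocity = [[[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)] for _ in range(n)]

cells = [rho for plane in density for los in plane for rho in los]
rho_bar = sum(cells) / len(cells)
mean_cube = [[[rho_bar] * n for _ in range(n)] for _ in range(n)]                   # rho_bar
fluct_cube = [[[rho - rho_bar for rho in los] for los in plane] for plane in density]  # rho_bar * delta

p = channel_map(density, velocity, 0.0, 0.2, 0.3)        # full channel, equation (5)
p_vc = channel_map(mean_cube, velocity, 0.0, 0.2, 0.3)   # velocity term, equation (6)
p_dc = channel_map(fluct_cube, velocity, 0.0, 0.2, 0.3)  # density term, equation (7)
print("largest |p - (p_vc + p_dc)|:",
      max(abs(p[i][j] - p_vc[i][j] - p_dc[i][j]) for i in range(n) for j in range(n)))
```

Because the channel intensity is linear in density, the decomposition holds to rounding error for any channel width; what changes with Δv is the relative size of the two terms.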

The relative importance of p_vc and p_dc depends on the channel width (Lazarian & Pogosyan 2000; Kandel, Lazarian & Pogosyan 2016; Hu et al. 2023). The narrower the channel width, the greater the contribution from p_vc. When the channel width Δv is less than the velocity dispersion |$\sqrt{\delta (v^2)}$| of the turbulent eddies under investigation, that is, |$\Delta v < \sqrt{\delta (v^2)}$|, the intensity fluctuation in such a thin channel is predominantly due to velocity fluctuation. Consequently, p(x, y, v) inherits the anisotropy of MHD turbulence. The intensity structures within p(x, y, v) elongate along the POS magnetic fields, and their corresponding anisotropy degree, as well as the topology, is correlated with the magnetization and inclination angle. On the other hand, the dominance of p_vc ensures that the morphology of intensity fluctuation within p(x, y, v) is less sensitive to the sonic Mach number M_s, because the anisotropy in MHD turbulence’s velocity field is not affected by M_s (Kowal & Lazarian 2010).

It is important to note that Clark, Peek & Miville-Deschênes (2019) questioned the validity of velocity caustics in the presence of thermal broadening in multiphase H i gas and suggested that the thin velocity channel is dominated by density fluctuations from cold filaments. The nature of the striations in channel maps was tested in Hu et al. (2023), by explicitly evaluating velocity and density contributions in velocity channels obtained from multiphase H i simulations and GALFA-H i observations. This study confirmed that the velocity caustics were responsible for the observed striation.

2.3 Anisotropy in thin velocity channels: dependence on the inclination angle of magnetic fields

The anisotropy of the observed intensity in a PPV channel, represented by p(x, y, v), is also affected by the inclination angle γ of the magnetic field with respect to the LOS, due to the projection effect (Hu, Lazarian & Xu 2021c). For example, as illustrated in Fig. 1, we consider two magnetized structures (or eddies), s₁ and s₃, both having identical magnetization. Although these unprojected structures have the same anisotropy degree, their projections differ. Specifically, a projection with a smaller inclination angle results in a lower anisotropy degree by reducing the projected scale parallel to the magnetic field. When γ = 0, the parallel scale of the eddy aligns with the LOS, making the anisotropy unobservable on the POS.

However, as previously mentioned, the degree of anisotropy is also controlled by magnetization. As shown in Fig. 1, although two magnetized structures (s₁ and s₂) share identical inclination angles, the projection of the weakly magnetized s₂ shows less anisotropy. Importantly, the topology of s₂ also changes, becoming less straight. This is because a weak magnetic field has more deviations and exhibits significant curvature in terms of its POS orientation (Yuen & Lazarian 2020). Consequently, the observed structure, as well as the structure’s topology, in p(x, y, v) is governed by both |$M_{\rm A}$| and γ (Hu, Lazarian & Xu 2021c).

To summarize succinctly, the thin channel maps p(x, y, v) from spectroscopic observations capture the anisotropy of MHD turbulence. This leads to the following important implications:

(1) The intensity structures in p(x, y, v) align with the POS magnetic field.
(2) The degree of anisotropy observed in these intensity structures is influenced by two distinct factors, |$M_{\rm A}$| and γ: (a) γ introduces a projection effect that decreases the anisotropy; (b) |$M_{\rm A}$| defines the magnetization level of the medium. A larger |$M_{\rm A}$| represents a weaker magnetic field, resulting in less pronounced anisotropy.
(3) Changes in |$M_{\rm A}$| also alter the topology of the magnetic field lines, as well as that of the observed intensity structures, manifesting as significant curvature.

The interconnection between magnetic field topology and |$M_{\rm A}$| is vital to extracting accurate 3D magnetic fields. A subtle change in the degree of anisotropy responds sensitively to variations in both |$M_{\rm A}$| and γ, leading to a degeneracy. This degeneracy necessitates the introduction of an additional feature that is sensitive to |$M_{\rm A}$| or γ to solve for these parameters, and the topology of the magnetic field conveniently provides this required information.
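The degeneracy between M_A and γ can be illustrated with a deliberately simplified toy model. The sketch below assumes that an eddy is a prolate spheroid whose intrinsic axis ratio follows equation (2) and whose long axis makes an angle γ with the LOS, so that its projected axis ratio is sqrt(r² sin²γ + cos²γ). This geometric projection is an assumption made for illustration only; as noted in the caption of Fig. 1, real channel-map structures arise from velocity caustics and the projection picture is simplified. All function names and numerical values are hypothetical.

```python
import math


def intrinsic_anisotropy(l_perp, l_inj, mach_a):
    """Axis ratio l_par / l_perp from equation (2), valid for M_A <= 1."""
    return (l_inj / l_perp) ** (1.0 / 3.0) * mach_a ** (-4.0 / 3.0)


def projected_anisotropy(ratio, gamma_deg):
    """POS axis ratio of a prolate spheroid with intrinsic axis ratio `ratio`
    whose long axis is inclined by gamma_deg with respect to the LOS (toy model)."""
    g = math.radians(gamma_deg)
    return math.sqrt(ratio ** 2 * math.sin(g) ** 2 + math.cos(g) ** 2)


def matching_inclination(ratio, target):
    """Inclination at which intrinsic axis ratio `ratio` projects to `target` (1 <= target <= ratio)."""
    sin2 = (target ** 2 - 1.0) / (ratio ** 2 - 1.0)
    return math.degrees(math.asin(math.sqrt(sin2)))


# Illustrative values: L_inj = 10 pc, l_perp = 0.1 pc.
r_strong = intrinsic_anisotropy(0.1, 10.0, 0.5)  # strongly magnetized eddy
r_weak = intrinsic_anisotropy(0.1, 10.0, 0.8)    # more weakly magnetized eddy

# The weakly magnetized eddy seen at gamma = 90 deg has the same projected
# axis ratio as the strongly magnetized eddy seen at a smaller inclination.
gamma_equiv = matching_inclination(r_strong, r_weak)
print("projected ratio, M_A = 0.8 at gamma = 90 deg:", projected_anisotropy(r_weak, 90.0))
print("projected ratio, M_A = 0.5 at gamma =", gamma_equiv, "deg:",
      projected_anisotropy(r_strong, gamma_equiv))
```

In this toy picture the axis ratio alone cannot separate the two cases, which is why an additional M_A-sensitive feature such as the curvature of the structures is needed.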

Additionally, it is crucial to acknowledge that relying solely on anisotropy does not offer a clear distinction regarding the magnetic field’s orientation along the LOS, specifically whether the field is directed towards or away from our observation point. Consequently, the value of γ is inherently restricted to a limited range between 0 and 90°.

3 NUMERICAL METHOD

3.1 Convolutional neural network (CNN)

To construct a deep neural network for the purpose of tracing the 3D magnetic field from a spectroscopic map, we adopt a CNN-based (LeCun et al. 1998) architecture. CNNs have demonstrated significant success in processing multidimensional data. The typical CNN architecture, as illustrated in Fig. 2, consists of initial layers comprising a stack of convolutional layers followed by pooling layers. To facilitate faster convergence during the network training process using backpropagation of the loss and enhance the stability of learning, we introduce a batch normalization layer following each convolution layer. After several iterations of convolution and pooling layers, we extract a compressed image feature, which is then processed by the fully connected layers to predict the desired properties. In the following, we introduce the core modules in the CNN architecture as well as the training procedure for the CNN network.

Figure 2.

Architecture of the CNN-model. The input image is a 22 × 22 pixel map cropped from the thin velocity channel map. The network outputs the prediction of ϕ, γ, or M_A.

3.2 Convolutional layer

Serving as the fundamental component of a CNN, the convolutional layer processes input data to produce feature maps (LeCun et al. 1989). In this layer, each neuron connects to a local region of the input feature map. This connection is achieved by applying a 2D convolutional kernel w_l to the input feature map. This process can be mathematically described as follows:

$$\begin{eqnarray} a_{l} = \sigma (w_{l} * h_{l-1}+ b_{l}), \end{eqnarray}$$

(8)

where h_{l − 1} and a_l are the input and output feature maps of the l-th convolutional layer, respectively, w_l is the learnable convolution kernel, and * denotes the convolution operation. In addition, a learnable bias b_l is added to the convolved feature map. To be more concrete,

$$\begin{eqnarray} a_{l}(x,y) = \sigma \left(\sum _{i=-k}^{k} \sum _{j=-k}^{k} w_{l}(i, j)h_{l-1}(x-i,y-j) {+} b_{l}(x,y)\right). \end{eqnarray}$$

(9)

By applying the 2D convolution kernel |$w_l \in \mathbb {R}^{(2k+1) \times (2k+1)}$| to the input feature map |$h_{l-1} \in \mathbb {R}^{d^{\rm in} \times d^{\rm in}}$|, we obtain an output feature map of size (dⁱⁿ − 2k) × (dⁱⁿ − 2k), assuming unit stride and no padding. Here, dⁱⁿ denotes the size of the input feature map and 2k + 1 is the size of the convolution kernel. The resulting locally-weighted sum, once added to the learned bias, undergoes a non-linear transformation via the ReLU activation function σ(·).
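Equation (9) can be written out directly as nested loops. The following sketch is a plain-Python illustration rather than the network's actual implementation: it assumes unit stride, no padding, a single input channel, and a scalar bias (a simplification of b_l(x, y)); the function names are hypothetical.

```python
def relu(value):
    """ReLU activation: max(0, value)."""
    return value if value > 0.0 else 0.0


def conv2d_valid(h, w, b=0.0):
    """Equation (9) on a square feature map h with a (2k+1) x (2k+1) kernel w.

    Only positions where the kernel fits entirely inside h are evaluated,
    so a d_in x d_in input yields a (d_in - 2k) x (d_in - 2k) output.
    """
    d_in = len(h)
    k = (len(w) - 1) // 2
    out = []
    for x in range(k, d_in - k):
        row = []
        for y in range(k, d_in - k):
            total = b
            for i in range(-k, k + 1):
                for j in range(-k, k + 1):
                    # True convolution: the kernel index and the image offset have opposite signs.
                    total += w[i + k][j + k] * h[x - i][y - j]
            row.append(relu(total))
        out.append(row)
    return out


# A 22 x 22 input and a 3 x 3 kernel (k = 1) give a 20 x 20 output.
patch = [[1.0] * 22 for _ in range(22)]
kernel = [[1.0] * 3 for _ in range(3)]
feature = conv2d_valid(patch, kernel)
print(len(feature), "x", len(feature[0]))
```

Convolving a single bright pixel with a kernel returns the kernel itself, which is a convenient check that the index convention matches equation (9) rather than a cross-correlation.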

To constrain the number of parameters that need to be learned in our network, we generally use small kernel sizes. While each layer has a limited receptive field focusing on local features through the utilization of small convolutional kernels, stacking multiple layers allows for the gradual expansion of this receptive field. Consequently, the network becomes capable of capturing global features within the image as the depth increases.

3.3 Batch normalization layer

Batch normalization is a technique frequently utilized in neural networks, playing a pivotal role in stabilizing them and hastening the convergence of the training loss during the backpropagation process (Ioffe & Szegedy 2015). During each training iteration, it operates on a mini-batch of data. The layer normalizes each feature within the input data by centring its values around the mean and scaling based on the feature’s standard deviation within the given batch. This normalization process is instrumental in mitigating the internal covariate shift – a phenomenon where the distribution of inputs at each layer undergoes changes during training – facilitating a more stable and efficient training process.

Following the normalization, batch normalization introduces two learnable parameters per feature: a scaling parameter and a shifting parameter. These parameters allow the network to learn the optimal scale and shift for the normalized values autonomously, providing the model with the flexibility to modify the normalization if it learns that such reversal or adjustment is beneficial for its predictive performance. These dynamic adjustments, enabled by the introduced parameters, imbue the network with a degree of adaptability, allowing it to fine-tune the transformations applied to the features as needed during the training.
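The two steps described above, normalization over the mini-batch followed by a learnable scale and shift, can be sketched for a single feature as follows. The names `scale`, `shift`, and the small stabilizing constant `eps` are labels chosen for this illustration; the paper does not list the values it uses.

```python
import math


def batch_norm(batch, scale=1.0, shift=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then apply the learnable scale and shift.

    `batch` holds the value of the same feature for every sample in the mini-batch;
    `eps` avoids division by zero when the batch variance is very small.
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [scale * (x - mean) / math.sqrt(var + eps) + shift for x in batch]


values = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(values)
rescaled = batch_norm(values, scale=2.0, shift=3.0)
print(normalized)
print(rescaled)
```

With scale = 1 and shift = 0 the output has approximately zero mean and unit variance; other values let the network undo or adjust the normalization when that helps the fit.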

3.4 Pooling layer

Following the detection of local features in the input feature maps by the convolution layer, a pooling layer is typically employed to merge similar local features into a singular feature (Sermanet et al. 2013). One common variant of the pooling layer is the Max Pooling Layer. This layer works by calculating the maximum value within a local patch of neurons and then outputting this maximum value as a single neuron. Importantly, the patches of input neurons for adjacent pooling units are shifted by more than one row or column, which effectively reduces the dimensionality of the feature representation. This process imparts the network with a degree of invariance to minor shifts and distortions in the input data, as it condenses the information in the feature maps while retaining the most salient features. This reduction not only helps in making the detection of features invariant to scale and orientation changes but also enhances computational efficiency by reducing the number of parameters and computations in the network.
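A max pooling operation of the kind described above can be sketched in a few lines. The 2 × 2 window and stride of 2 are assumptions chosen for illustration, since the text does not state the values used in the network, and the function name is hypothetical.

```python
def max_pool2d(h, size=2, stride=2):
    """Replace each size x size patch of the feature map h by its maximum value."""
    n_rows, n_cols = len(h), len(h[0])
    out = []
    for x in range(0, n_rows - size + 1, stride):
        row = []
        for y in range(0, n_cols - size + 1, stride):
            row.append(max(h[x + i][y + j] for i in range(size) for j in range(size)))
        out.append(row)
    return out


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool2d(feature_map)
print(pooled)
```

Here a 4 × 4 map is reduced to 2 × 2, and shifting a bright feature by one pixel within a window leaves the pooled value unchanged.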

3.5 Fully connected layer

After sequential operations that involve multiple convolutional layers and aggregation, the network derives a lower-dimensional compressed image feature map. Subsequently, this 2D feature map undergoes a transformation, being flattened into a 1D vector. The fully connected layer then processes this vector (Goodfellow, Bengio & Courville 2016). The role of the layer is critical, as it integrates the high-level reasoning of the features extracted and flattened previously. The mechanism involves applying learned weights and biases to this flattened vector to predict the final output. Mathematically, this operation can be represented as:

$$\begin{eqnarray} \pmb {y} = \sigma (\pmb{W}\pmb{h}+\pmb{b}). \end{eqnarray}$$

(10)

In this equation, |$\pmb{h} \in \mathbb{R}^{d_{\rm in}}$| represents the flattened, compressed image feature vector, and |$\pmb{y} \in \mathbb {R}^{d_{\rm out}}$| symbolizes the predicted result. Here, |$\pmb{W} \in \mathbb {R}^{d_{\rm out} \times d_{\rm in}}$| and |$\pmb{b} \in \mathbb {R}^{d_{\rm out}}$| denote the learnable weights and biases for the fully connected layer, respectively, and d_in and d_out are the sizes of the input vector and of the output vector. These weights and biases are integral to the layer’s functionality, providing the means for it to learn and adapt during the training phase, ultimately allowing for the accurate prediction of the desired output from the input images.
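Equation (10) amounts to a matrix–vector product followed by an activation. The sketch below uses small hand-picked weights purely for illustration; the function name, the optional `activation` argument, and the numbers are assumptions and do not correspond to the trained network.

```python
def fully_connected(weights, h, bias, activation=None):
    """Equation (10): y = sigma(W h + b) for a d_out x d_in weight matrix stored as nested lists."""
    y = [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i
         for row, b_i in zip(weights, bias)]
    if activation is None:
        return y
    return [activation(value) for value in y]


def relu(value):
    """ReLU activation: max(0, value)."""
    return value if value > 0.0 else 0.0


W = [[1.0, 0.0, -1.0],
     [2.0, 2.0, 2.0]]   # d_out = 2, d_in = 3
h = [1.0, 2.0, 3.0]     # flattened feature vector
b = [0.5, -1.0]
print(fully_connected(W, h, b))
print(fully_connected(W, h, b, activation=relu))
```

Leaving the activation off in the final layer is a common choice for regression targets such as angles or M_A; whether the network here does so is not stated in the text.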

3.6 Network training

The trainable parameters within the CNN are optimized by adhering to a conventional neural network training methodology, where the mean-squared error of the 3D magnetic field prediction serves as the training loss for backpropagation, as outlined in the seminal work by Rumelhart, Hinton & Williams (1986). During the training process, we implement a strategy designed to enrich the diversity of the training data set and consequently enhance the generalization capabilities of the deep neural network. Specifically, this involves augmenting the input images by subjecting them to random cropping operations, resulting in smaller patches of size 22 × 22 cells. Such augmentation introduces variability and randomness into the training data, which is instrumental in refining the network’s ability to generalize from the training data to unseen data, thereby bolstering its predictive accuracy and robustness. In total, we generated ≈1.7 × 10⁷ input 22 × 22-cell maps, with 20 per cent of them serving as a validation set, for each molecular species.
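The data handling described above (random 22 × 22 crops, a 20 per cent validation split, and a mean-squared-error loss) can be sketched as follows. This is a schematic illustration with hypothetical function names and a synthetic 64 × 64 image; it is not the training pipeline used for the results.

```python
import random


def random_crop(image, size=22, rng=random):
    """Cut a random size x size patch out of a 2D map stored as nested lists."""
    n_rows, n_cols = len(image), len(image[0])
    top = rng.randrange(n_rows - size + 1)
    left = rng.randrange(n_cols - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def train_validation_split(samples, val_fraction=0.2, rng=random):
    """Shuffle the samples and set aside a fraction of them for validation."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_val = int(round(val_fraction * len(shuffled)))
    return shuffled[n_val:], shuffled[:n_val]


def mean_squared_error(predicted, target):
    """Training loss: mean of the squared differences between prediction and truth."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


rng = random.Random(42)
image = [[float(i * 64 + j) for j in range(64)] for i in range(64)]  # synthetic channel map
patch = random_crop(image, 22, rng)
train, validation = train_validation_split(range(100), 0.2, rng)
print(len(patch), "x", len(patch[0]), "patch;", len(train), "training and",
      len(validation), "validation samples")
print("example loss:", mean_squared_error([1.0, 2.0], [1.0, 4.0]))
```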

3.7 MHD simulations

The numerical simulations used in this study were executed using the ZEUS-MP/3D code (Hayes et al. 2006). We performed an isothermal simulation of a 10 pc cloud by solving the ideal MHD equations in an Eulerian frame under periodic boundary conditions:

$$\begin{eqnarray} && \partial \rho /\partial t +\nabla \cdot (\rho \pmb {v})=0,\\&&\partial (\rho \pmb {v})/\partial t+\nabla \cdot \left[\rho \pmb {v}\pmb {v}^T+\bigg(c_{\rm s}^2\rho +\frac{B^2}{8\pi}\bigg)\pmb {I}-\frac{\pmb {B}\pmb {B}^T}{4\pi }\right]=\pmb {f},\\&& \partial \pmb {B}/\partial t-\nabla \times (\pmb {v}\times \pmb {B})=0,\\&&\nabla \cdot \pmb {B}=0, \end{eqnarray}$$

(11)

where |$\pmb {f}$| represents the stochastic forcing term used to drive turbulence. ρ, |$\pmb {v}$|⁠, and |$\pmb {B}$| are mass density, velocity, and magnetic field, respectively. Given the isothermal equation of state, the sound speed c_s was held constant at approximately 187 m s⁻¹, corresponding to a gas temperature of 10 K. Purely turbulent scenarios were also considered, excluding the impact of self-gravity. Kinetic energy was solenoidally (i.e. the forcing term is divergence-free) injected at the wavenumber k = 2π/l ≈ 2 (in the unit of 2π/L_box, where L_box is the length of simulation box) in Fourier space, where l is the length-scale in real space, producing a Kolmogorov-like power spectrum. Turbulence was continuously stimulated until it reached a state of statistical saturation. The simulation was solved on a regular grid of 792³ cells and the turbulence was numerically dissipated at scales of approximately 10–20 cells.
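As a quick consistency check on the quoted sound speed, the isothermal relation c_s = (k_B T / μ m_H)^(1/2) can be evaluated directly. The mean molecular weight μ = 2.37 used below is an assumption (a common choice for molecular gas with helium) and is not stated in the text.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant in J/K
M_H = 1.6735575e-27   # mass of a hydrogen atom in kg

def isothermal_sound_speed(temperature_k, mu=2.37):
    """Isothermal sound speed in m/s; mu = 2.37 is an assumed mean molecular weight."""
    return math.sqrt(K_B * temperature_k / (mu * M_H))

c_s = isothermal_sound_speed(10.0)
print(f"c_s = {c_s:.1f} m/s")
```

With this assumed μ, the result at 10 K lies close to the ≈187 m s⁻¹ adopted for the simulations.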

The simulations were initialized with a uniform density field and a magnetic field, with the initial mean magnetic field oriented along the y-axis. Furthermore, we rotated the simulation cubes so that the mean angle of inclination with respect to the LOS (or z-axis) reached 90°, 60°, and 30°. The sonic Mach number, M_s = v_inj/c_s, and the Alfvénic Mach number, M_A = v_inj/v_A, characterize MHD turbulence simulations. To model different ISM conditions, we used a typical mean number density of 300 cm⁻³ and varied the initial uniform magnetic field and the injected kinetic energy to obtain a range of M_A and M_s values. In this paper, we refer to the simulations in Table 1 by their model name or key parameters.

Table 1.


M_s and M_A are the sonic Mach number and the Alfvénic Mach number calculated from the global injection velocity, respectively. M|$_{\rm A}^{\rm sub}$| and M|$_{\rm s}^{\rm sub}$| are determined using the local velocity dispersion calculated along each LOS in a 22 × 22 cell subfield. The expressions ‘min{…}’ and ‘max{…}’ denote the minimum and maximum value averaged over each 22 × 22 cell subfield within the corresponding simulation.

Run	M_s	M_A	min{M|$_{\rm A}^{\rm sub}$|}	max{M|$_{\rm A}^{\rm sub}$|}	min{M|$_{\rm s}^{\rm sub}$|}	max{M|$_{\rm s}^{\rm sub}$|}
A0	5.33	0.20	0.03	0.28	2.97	7.84
A1	5.38	0.41	0.10	0.81	2.90	7.24
A2	5.40	0.61	0.21	1.00	3.15	7.33
A3	5.20	0.79	0.29	1.37	3.10	6.55
A4	5.23	0.95	0.30	1.99	3.00	7.18
A5	5.12	1.13	0.32	2.49	3.17	6.80
A6	5.38	1.09	0.41	3.37	3.13	6.96
A7	5.23	1.39	0.40	4.13	3.19	7.41
A8	5.16	1.46	0.39	4.94	3.21	6.76
A9	5.08	1.43	0.48	6.06	2.87	7.10


3.8 Emission lines of ¹³CO and C¹⁸O

We generate synthetic emission lines for two CO isotopologues: ¹³CO (1–0) and C¹⁸O (1–0), following the procedures used in Hu & Lazarian (2021). This was achieved using the SPARX radiative transfer code (Hsieh et al. 2019). SPARX solves the radiative transfer equation (RTE) for finite cells, which means that it considers the emission from a homogeneous finite element. The equation of statistical equilibrium for molecular levels takes into account molecular self-emission, stimulated emission, and collisions with gas particles. Information on the distribution of molecular gas density with mean density ∼300 cm⁻³ and LOS velocity was extracted from the MHD simulations mentioned above.

The fractional abundances of the CO isotopologues ¹³CO(1–0) and C¹⁸O(1–0) were set at 2 × 10⁻⁶ and 1.7 × 10⁻⁷, respectively. We derive the ¹²CO-to-H₂ ratio of 1 × 10⁻⁴ from the cosmic value of C/H = 3 × 10⁻⁴ and the assumption that 15 per cent of C is in molecular form. The abundance of ¹³CO is determined using a ¹³CO/¹²CO ratio of 1/69, as indicated by Wilson (1999), giving a ¹³CO/H₂ ratio of approximately 2 × 10⁻⁶. Using a ¹²CO/C¹⁸O ratio of 500, as given by Wilson, Rohlfs & Hüttemeister (2013), we obtained a C¹⁸O-to-H₂ ratio of 1.7 × 10⁻⁷. When generating these synthetic emission lines, we specifically focused on the lowest-transition J = 1–0 of the CO isotopologues, with the Local Thermodynamic Equilibrium (LTE) satisfied.

3.9 Training images

Our training input is a thin velocity channel map, p(x, y, v₀), derived from either the ¹³CO (1–0) or C¹⁸O (1–0) line, calculated from:

$$\begin{eqnarray} p(x,y,v_0)=\int _{v_0-\Delta v/2}^{v_0+\Delta v/2}T_{\rm e}(x,y,v)\,{\rm d}v, \end{eqnarray}$$

(12)

where v₀ is the velocity associated with the line’s central peak, T_e is the emission line’s intensity, and |$\Delta v=\sqrt{\delta (v^2)}$|⁠. Here, |$\sqrt{\delta (v^2)}$| is the velocity dispersion derived from the moment-1 map (velocity centroid map). The ¹²CO line, a common diffuse cloud tracer, is not used in this work due to numerical limitations related to the saturation of the intensity of ¹²CO in the channel centring at v₀, which obliterates the spatial features of that channel (Hsieh et al. 2019). However, the CNN method could be extended to include wing channels centring at |v| < v₀ to bypass this numerical saturation, a possibility we might explore in future work.²
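Equation (12) can be sketched as follows for a position-position-velocity cube. The synthetic Gaussian cube, the uniform velocity grid, and the function name `thin_channel_map` are assumptions for illustration; in particular, taking Δv as the standard deviation of the moment-1 map is one reading of the definition above.

```python
import numpy as np

def thin_channel_map(cube, velocities):
    """Integrate a (n_v, n_y, n_x) cube over a channel of width delta_v centred on v0."""
    dv_chan = velocities[1] - velocities[0]            # uniform channel spacing
    mean_spectrum = cube.mean(axis=(1, 2))
    v0 = velocities[np.argmax(mean_spectrum)]          # velocity of the line peak
    moment0 = cube.sum(axis=0)
    moment1 = (cube * velocities[:, None, None]).sum(axis=0) / moment0
    delta_v = moment1.std()                            # dispersion of the centroid map
    in_channel = np.abs(velocities - v0) <= delta_v / 2
    channel = cube[in_channel].sum(axis=0) * dv_chan   # discrete version of the integral
    return channel, v0, delta_v

# Synthetic cube: Gaussian lines whose centroid varies across the map.
velocities = np.linspace(-5.0, 5.0, 101)               # km/s
yy, xx = np.mgrid[0:16, 0:16]
centroid = 0.8 * np.sin(2 * np.pi * xx / 16)
cube = np.exp(-0.5 * (velocities[:, None, None] - centroid) ** 2)

channel, v0, delta_v = thin_channel_map(cube, velocities)
print(channel.shape)
```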

We generate p(x, y, v₀) for the full cloud, a region of 792 × 792 cells, then randomly segment p(x, y, v₀) into 22 × 22-cell subfields for input into the CNN model. The choice of 22 × 22 cells keeps the features out of the numerical dissipation range, in which the anisotropy of MHD turbulence is distorted by numerical diffusivity. In observations, the inertial range of MHD turbulence is much longer, and the velocity channel map is not affected by dissipation. The subfield could therefore be smaller to achieve higher resolution. For each subfield, we also generate corresponding projected maps of ϕ^sub, γ^sub, M|$_{\rm A}^{\rm sub}$|⁠, and M|$_{\rm s}^{\rm sub}$| as per the following:

$$\begin{eqnarray} && \phi^{\rm sub}(x,y) = \arctan \left(\frac{\int B_y(x,y,z)\,{\rm d}z}{\int B_x(x,y,z)\,{\rm d}z}\right),\\ && \gamma^{\rm sub}(x,y) = \arccos \left(\frac{\int B_z(x,y,z)\,{\rm d}z}{\int B(x,y,z)\,{\rm d}z}\right),\\ && M_{\rm A}^{\rm sub} =\frac{v_{\rm inj}^{\rm los}\sqrt{4\pi \langle\rho\rangle _{\rm los}}} {\langle B\rangle _{\rm los}}, \\ && M_{\rm s}^{\rm sub} = \frac{v_{\rm inj}^{\rm los}}{c_{\rm s}}, \end{eqnarray}$$

(13)

where |$B=\sqrt{B_x^2+B_y^2+B_z^2}$| is the total magnetic field strength, and B_x, B_y, and B_z are its x, y, and z components. 〈ρ〉_los and 〈B〉_los are the gas mass density and magnetic field strength averaged along the LOS. M|$_{\rm A}^{\rm sub}$| and M|$_{\rm s}^{\rm sub}$| are defined using the local velocity dispersion for each LOS (i.e. |$v_{\rm inj}^{\rm los}$|⁠), rather than the global turbulent injection velocity v_inj used to characterize the full simulation. The ranges of M|$_{\rm A}^{\rm sub}$| and M|$_{\rm s}^{\rm sub}$| averaged over the subfield in each simulation with different γ are listed in Table 1, while γ^sub spans from 0 to 90°. These values of M|$_{\rm A}^{\rm sub}$|⁠, M|$_{\rm s}^{\rm sub}$|⁠, and γ^sub cover typical physical conditions of diffuse molecular clouds (Hu & Lazarian 2023c).
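The projected quantities in equation (13) can be sketched with NumPy for simulation cubes in CGS units. The array layout (LOS along the last axis), the unweighted standard deviation used for the LOS velocity dispersion, and the function name `projected_maps` are assumptions for illustration. `arctan2` folded into [0, π) is used instead of a plain arctan of the ratio; the two agree modulo π and the former avoids division by zero.

```python
import numpy as np

def projected_maps(bx, by, bz, rho, v_los, c_s):
    """Return (phi, gamma, M_A, M_s) maps from cubes indexed as (x, y, z)."""
    b_tot = np.sqrt(bx**2 + by**2 + bz**2)
    # POS orientation, defined modulo pi.
    phi = np.mod(np.arctan2(by.sum(axis=-1), bx.sum(axis=-1)), np.pi)
    # Inclination relative to the LOS; arccos returns values in [0, pi].
    ratio = np.clip(bz.sum(axis=-1) / b_tot.sum(axis=-1), -1.0, 1.0)
    gamma = np.arccos(ratio)
    # Local velocity dispersion along each LOS (unweighted, by assumption).
    sigma_v = v_los.std(axis=-1)
    m_a = sigma_v * np.sqrt(4 * np.pi * rho.mean(axis=-1)) / b_tot.mean(axis=-1)
    m_s = sigma_v / c_s
    return phi, gamma, m_a, m_s

# Toy cube: uniform field along y, uniform density, alternating LOS velocity.
shape = (2, 2, 4)
b0, rho0, c_s = 1e-5, 1e-21, 1.87e4          # G, g cm^-3, cm s^-1
bx, bz = np.zeros(shape), np.zeros(shape)
by = np.full(shape, b0)
rho = np.full(shape, rho0)
v_los = np.tile(np.array([-1e4, 1e4, -1e4, 1e4]), (2, 2, 1))

phi, gamma, m_a, m_s = projected_maps(bx, by, bz, rho, v_los, c_s)
print(np.degrees(phi[0, 0]), np.degrees(gamma[0, 0]))
```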

4 RESULTS

4.1 Numerical training and tests

Fig. 3 provides a visualization detailing the influence of M_A and γ on the anisotropy of intensity structures within thin velocity channels. In scenarios where both M_A and γ values are small, the intensity structures distinctly manifest as slender strips, extending in alignment with the POS magnetic fields. These structures are produced predominantly by the turbulent velocity (Lazarian & Pogosyan 2000), as demonstrated in Hu et al. (2023). As M_A increases, representing a weakening in the magnetic field, the MHD turbulence begins to more closely resemble isotropic hydrodynamical turbulence. This shift brings about a marked change in the topology of intensity structures, making them less anisotropic. Alternatively, when dealing with smaller values of γ, which imply that magnetic fields are oriented more proximally to the LOS, the inherent anisotropy is subdued due to the projection effect. Comparing ¹³CO and C¹⁸O, C¹⁸O is more sensitive to denser gas, so its associated intensity structures exhibit distinct characteristics. Despite these differences, the underlying physical principle of anisotropic MHD turbulence remains the same, suggesting M_A and γ continue to shape the observed structural formations.


Figure 3.

A numerical illustration of the anisotropy in ¹³CO (top) and C¹⁸O (bottom) channel maps. The red streamlines represent the POS magnetic field orientation. Panel (a): M_A = 0.20, γ = 90°. Panel (b): M_A = 1.43, γ = 90°. Panel (c): M_A = 0.20, γ = 60°.


Fig. 4 compares the actual 3D magnetic fields with those predicted by the trained CNN model using ¹³CO. The comparison covers two conditions: sub-Alfvénic (simulation with 〈M_A〉 ≈ 0.5 and 〈γ〉 ≈ 90°) and super-Alfvénic (simulation with 〈M_A〉 ≈ 2.0 and 〈γ〉 ≈ 30°), where 〈M_A〉 denotes the mean projected total Alfvén Mach number on the POS.


Figure 4.

A comparison of the CNN-predicted 3D magnetic fields using ¹³CO in sub-Alfvénic (top, 〈M_A〉 ≈ 0.5 and 〈γ〉 ≈ 90°) and super-Alfvénic (bottom, 〈M_A〉 ≈ 2.0 and 〈γ〉 ≈ 30°) conditions. Each magnetic field segment is constructed from the POS magnetic field’s position angle (i.e. ϕ) and the inclination angle γ. Note that the magnetic field obtained is the projection along the LOS and averaged over 132 × 132 pixels for visualization purposes. The third axis of the LOS is for 3D visualization purposes and does not provide distance information here. The total intensity map I is placed on the POS, i.e. the x–y plane.


Each segment displayed in Fig. 4 is constructed from the POS magnetic field’s position angle, ϕ, and the inclination angle, γ, with a superimposed colour representing the projected M_A. Compared with the intrinsic magnetic field in the simulation, the orientations of the CNN-predicted 3D magnetic field align well with the actual field under both sub-Alfvénic and super-Alfvénic conditions. In the sub-Alfvénic case, the CNN-predicted M_A is slightly larger (by ≈0.1–0.2) than the actual values. Conversely, in the super-Alfvénic scenario, the predicted value is somewhat smaller, with a deviation of ≈0.5–1.0. Another example with 〈M_A〉 ≈ 0.15 and 〈γ〉 ≈ 60° is presented in Appendix A. Although this simulation shows an anisotropy degree similar to the case with 〈M_A〉 ≈ 0.5 and 〈γ〉 ≈ 90°, the CNN model effectively resolves the degeneracy in the correlation of the anisotropy degree with γ and M_A (see Section 2), successfully recovering the 3D magnetic field (see Fig. A1). It should be noted that the predicted M_A is still overestimated by approximately 0.1–0.2.

Fig. 5 offers a similar visual comparison but focuses on the C¹⁸O line, which is generally recognized as a tracer of denser gas than ¹³CO. Despite this difference, the CNN predictions for the C¹⁸O line remain generally aligned with the actual 3D magnetic fields in the simulations. Moreover, the overestimation and underestimation of the CNN-predicted M_A are less significant.

Figure 5.

Same as Fig. 4, but for C¹⁸O.


Figs 6 and 7 present 2D histograms illustrating the correspondence between CNN predictions – ϕ^CNN, γ^CNN, and M|$_{\rm A}^{\rm CNN}$| – and actual values obtained from two test simulations, A2 and A6. In sub-Alfvénic cases for both ¹³CO and C¹⁸O molecules, we observe a close alignment between the CNN predictions and the real values. The predictions of ϕ^CNN, γ^CNN, and M|$_{\rm A}^{\rm CNN}$| show only a small scatter around the actual values, clustering tightly near the one-to-one reference line. This minimal deviation suggests that the CNN model offers a high degree of accuracy and reliability when operating under sub-Alfvénic conditions.


Figure 6.

2D histogram of the ¹³CO CNN-predictions, i.e. ϕ^CNN (left), γ^CNN (middle), and M|$_{\rm A}^{\rm CNN}$| (right), and the corresponding actual values in simulation (Top: sub-Alfvén, 〈M_A〉 ≈ 0.5 and 〈γ〉 ≈ 90°. Bottom: super-Alfvén, 〈M_A〉 ≈ 2.0 and 〈γ〉 ≈ 30°). The dashed reference line represents the ideal scenario, where the predicted values and actual values match perfectly.


Figure 7.

Same as Fig. 6, but for C¹⁸O.


However, the scenario is a bit different in super-Alfvénic cases. Here, the scatter is noticeably more widespread, indicating that deviations from the real values increase in these conditions. The ϕ^CNN predictions, in particular, show a tendency for both overestimation and underestimation. In contrast, the γ^CNN predictions are primarily characterized by overestimations, a trend that is especially prominent in cases involving C¹⁸O molecules. Meanwhile, the scatter related to the M|$_{\rm A}^{\rm CNN}$| predictions is distributed more uniformly around the reference line.

This suggests that predicting the 3D magnetic field under super-Alfvénic conditions is more challenging and carries higher uncertainty. In these environments, the magnetic field exerts a weaker influence, and the turbulence more closely resembles hydrodynamic turbulence, thereby complicating the prediction process. Prediction accuracy could be enhanced through two strategies. First, the CNN model can be further refined and optimized to improve its adaptability and responsiveness to the unique features of super-Alfvénic MHD turbulence. For instance, Peek & Burkhart (2019) put forth a CNN model designed specifically to differentiate between sub-Alfvénic and super-Alfvénic turbulence. This model, with its specialized focus, offers a promising avenue for enhancing the accuracy of predictions in super-Alfvénic environments. Second, the data set used to train the CNN model can be enriched. By incorporating a broader and more diverse range of images, the model is exposed to a wider array of scenarios and conditions, which reduces uncertainty and improves its ability to make accurate predictions across different environments.

Figs 8 and 9 plot the histograms of the deviation between the CNN-predicted and the actual 3D magnetic field. We calculate the absolute difference between ϕ^CNN and ϕ, between γ^CNN and γ, and between M|$_{\rm A}^{\rm CNN}$| and M_A, respectively. These differences are denoted as σ_ϕ, σ_γ, and |$\sigma _{{\rm M}_{\rm A}}$|⁠. In the sub-Alfvénic scenarios, we observed that the distributions of σ_ϕ and σ_γ are relatively condensed, primarily falling within the 0 to 20° range. This concentration indicates a close alignment between the CNN predictions and the actual values in sub-Alfvénic environments, suggesting that the CNN model performs with high precision in these conditions. However, as 〈M_A〉 increases, the distributions of σ_ϕ and σ_γ broaden, spanning a more extensive range from 0 to 60°. This dispersion is indicative of larger deviations between predicted and actual values under these conditions, implying that the CNN model may face challenges in accurately capturing the magnetic field dynamics when 〈M_A〉 increases.


Figure 8.

Histograms of difference in CNN-predicted ϕ^CNN (left), γ^CNN (middle), and M|$_{\rm A}^{\rm CNN}$| (right), and the actual values in simulations using ¹³CO.


Figure 9.

Same as Fig. 8, but for C¹⁸O.


Examining specific molecules, for ¹³CO under sub-Alfvénic conditions, the median deviation values are relatively low: σ_ϕ = 3.26°, σ_γ = 2.98°, and |$\sigma _{{\rm M}_{\rm A}}=0.16$|⁠. In contrast, under super-Alfvénic conditions, these values increase to 12.32°, 9.08°, and 1.1, respectively, highlighting an increase in prediction deviation as the environment transitions from sub- to super-Alfvénic. Similarly, for C¹⁸O, the median deviation values are 2.22°, 3.20°, and 0.16 under sub-Alfvénic conditions and 12.08°, 13.60°, and 1.36 under super-Alfvénic scenarios, underlining a consistent trend of increased deviation in super-Alfvénic environments across different molecules.
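The median deviations quoted above can be computed along the following lines. This sketch assumes that the position angle ϕ is an orientation with a 180° ambiguity, so its absolute difference is wrapped into [0°, 90°]; the text only states that absolute differences are used, so the wrapping, the helper names, and the toy numbers are all assumptions.

```python
import statistics

def angle_deviation_deg(pred, true, period=180.0):
    """Absolute difference between two orientations, wrapped to [0, period / 2]."""
    d = abs(pred - true) % period
    return min(d, period - d)

def median_deviation(preds, trues, period=None):
    """Median absolute deviation; pass period=180.0 for orientation angles."""
    if period is None:
        devs = [abs(p - t) for p, t in zip(preds, trues)]
    else:
        devs = [angle_deviation_deg(p, t, period) for p, t in zip(preds, trues)]
    return statistics.median(devs)

# Toy values, not results from the simulations.
phi_pred, phi_true = [2.0, 178.0, 91.0], [179.0, 1.0, 88.0]
ma_pred, ma_true = [0.6, 0.4, 0.7], [0.5, 0.5, 0.5]

sigma_phi = median_deviation(phi_pred, phi_true, period=180.0)
sigma_ma = median_deviation(ma_pred, ma_true)
print(sigma_phi, sigma_ma)
```

Without the wrapping, orientations of 2° and 179° would be counted as 177° apart rather than 3°, which would inflate the tail of the σ_ϕ distribution.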

4.2 Observational prediction

For the observational tests, our target is the nearby L1478 cloud. We used the ¹³CO spectral line data from Lewis et al. (2021). The data have a beam resolution of 38 arcmin and were regridded to a pixel resolution of 10 arcsec, with a velocity resolution of 0.3 km s⁻¹. The 1D velocity dispersion σ_v of the ¹³CO line was reported to lie within the range 0.40–0.70 km s⁻¹ (Lewis et al. 2021). Assuming an isotropic velocity dispersion in 3D and a uniform temperature of 10 K (corresponding to an isothermal sound speed of c_s ∼ 0.187 km s⁻¹, see Hu, Lazarian & Stanimirović 2021b), we find that the sonic Mach number M|$_{\rm s}=\sqrt{3}\sigma _v/c_{\rm s}$| ranges from 3.69 to 6.45, which falls within the parameter regime of our numerical simulations. We then applied our trained CNN model to the ¹³CO channel map to predict the key 3D magnetic field parameters, |$\phi ^{\rm CNN}, \gamma ^{\rm CNN}, M_{\rm A}^{\rm CNN}$|⁠.
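The Mach-number estimate follows from simple arithmetic, sketched below with the rounded values quoted in the text. The function name `sonic_mach_number` is hypothetical, and the result depends slightly on how c_s is rounded.

```python
import math

def sonic_mach_number(sigma_v_1d, c_s):
    """M_s = sqrt(3) * sigma_v / c_s, assuming an isotropic 3D velocity dispersion."""
    return math.sqrt(3.0) * sigma_v_1d / c_s

c_s = 0.187  # km/s, isothermal sound speed at 10 K as quoted in the text

m_s_low = sonic_mach_number(0.40, c_s)
m_s_high = sonic_mach_number(0.70, c_s)
print(f"M_s ranges from {m_s_low:.2f} to {m_s_high:.2f}")
```

With the rounded sound speed, the values come out at about 3.7 and 6.5, consistent with the quoted range to within rounding.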

To validate the CNN results, we compared them with the POS magnetic field orientations inferred from Planck 353 GHz polarization data. The data were drawn from the third Public Data Release (DR3) of Planck’s High-Frequency Instrument (Planck Collaboration et al. 2020a). The POS magnetic field orientation was inferred from the Stokes parameters Q and U, converted from the HEALPix convention to the IAU convention, using the equation: |$\phi^{\rm Planck} = \frac{1}{2}\tan^{-1}(-U,Q) + \pi /2$|⁠. To enhance the signal-to-noise ratio, we smoothed the Stokes parameter maps from an angular resolution of 5 arcmin to 10 arcmin using a Gaussian kernel.
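The angle formula above reads naturally as a two-argument arctangent, as in the sketch below. The function names are hypothetical. The kernel-width helper relies on the standard property that Gaussian widths add in quadrature under convolution; the text does not state how the kernel was chosen, so that part is an assumption.

```python
import math

def planck_b_angle(q, u):
    """POS magnetic field angle in radians from Stokes Q and U: 0.5 * atan2(-U, Q) + pi / 2."""
    return 0.5 * math.atan2(-u, q) + math.pi / 2

def smoothing_kernel_fwhm(fwhm_in, fwhm_out):
    """FWHM of the Gaussian kernel that degrades fwhm_in to fwhm_out (same units)."""
    return math.sqrt(fwhm_out**2 - fwhm_in**2)

angle = planck_b_angle(1.0, 0.0)
kernel = smoothing_kernel_fwhm(5.0, 10.0)  # arcmin
print(math.degrees(angle), kernel)
```

The factor of 1/2 reflects that polarization is a spin-2 quantity, and the added π/2 rotates the polarization direction into the inferred field direction.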

As presented in Fig. 10, the magnetic field orientations predicted by the CNN model align remarkably well with those inferred from the Planck polarization data, although a difference is apparent in the north-east clump (see the zoom-in plot in Fig. 10). To quantify the agreement between the CNN prediction and polarization, we utilize the Alignment Measure (AM; González-Casanova & Lazarian 2017), expressed as:

$$\begin{eqnarray} {\rm AM} = \langle \cos (2\theta _{\rm r})\rangle , \end{eqnarray}$$

(14)

where θ_r is the relative angle between the two measurements. An AM value of ≈0.94 confirms that the CNN prediction agrees excellently with the Planck polarization³, corresponding to an overall deviation of ≈10°.
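Equation (14) and the conversion of an AM value into a characteristic angle can be sketched as follows. The function names are hypothetical, and inverting the mean of cos(2θ_r) gives a representative deviation rather than the mean of the individual angle differences.

```python
import math

def alignment_measure(angles_a, angles_b):
    """AM = <cos(2 * theta_r)> for two lists of orientation angles in radians."""
    diffs = [a - b for a, b in zip(angles_a, angles_b)]
    return sum(math.cos(2.0 * d) for d in diffs) / len(diffs)

def am_to_deviation_deg(am):
    """Characteristic relative angle in degrees implied by an AM value."""
    return math.degrees(0.5 * math.acos(am))

parallel = alignment_measure([0.0, 0.3], [0.0, 0.3])
perpendicular = alignment_measure([0.0], [math.pi / 2])
deviation = am_to_deviation_deg(0.94)
print(parallel, perpendicular, round(deviation, 1))
```

AM equals 1 for perfectly parallel orientations and −1 for perpendicular ones, and AM ≈ 0.94 maps to a relative angle of roughly 10°.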


Figure 10.

Comparison of the POS magnetic fields predicted by CNN-¹³CO (red segment) for the L1478 cloud and inferred from Planck polarization (blue segment). The background image is the integrated ¹³CO intensity map.


A noteworthy advantage of our CNN model over traditional polarization methodologies is its ability to trace the 3D magnetic fields. This is achieved through the model’s predictions of γ and M_A, which are summarized in the histograms in Fig. 11. According to the histograms, the median γ and M_A of the L1478 cloud are estimated at ≈76° and ≈1.07, respectively. These measurements suggest that L1478 is a trans-Alfvénic cloud, in which magnetic and turbulent kinetic energies are in equipartition. The parameters derived from the CNN application were used to create the first 3D magnetic field map of L1478, shown in Fig. 12.


Figure 11.

Histograms of CNN-predicted (as well as Planck measured) ϕ^CNN (left), defined east from the north, γ^CNN (middle), and M|$_{\rm A}^{\rm CNN}$| (right).



Figure 12.

A visualization of the CNN-predicted 3D magnetic fields using ¹³CO for the L1478 cloud. Each magnetic field segment is constructed from the position angle of the POS magnetic field (i.e. ϕ) and the inclination angle γ. Note that the magnetic field obtained is the projection along the LOS and averaged over 12 × 12 pixels for visualization purposes. The third axis of the LOS is for 3D visualization purposes and does not provide distance information here. The total intensity map I is placed on the POS, i.e. the l–b plane.


5 DISCUSSION

5.1 Comparison with earlier studies

The realm of exploring magnetic fields within the ISM through CNN is experiencing swift advancements. As a pilot study presented by Xu, Law & Tan (2023), the Convolutional Approach to Structure Identification-3D (CASI-3D) model was employed to map the 2D POS magnetic field orientation. This is achieved similarly by using the velocity channel maps obtained from spectroscopic observations. The underlying physics principle is still founded on the anisotropic MHD turbulence. The training process underpinning this approach uses the emission lines of ¹²CO and ¹³CO (J = 1–0), generated through the RADMC-3D code (Dullemond et al. 2012).

In this study, we introduce a new CNN model. This advanced model is designed with the aim of predicting not merely the orientation ϕ of the POS magnetic field but extends to encompass the angle of field inclination, γ, as well as the total Alfvén Mach number M_A. This approach allows the construction of 3D magnetic field vectors. For training the CNN model, we have utilized emission lines from ¹³CO and C¹⁸O (J = 1–0), with data generated from the SPARX code (Hsieh et al. 2019).

We quantify the uncertainty of our CNN-predicted ϕ and γ. For C¹⁸O, the median uncertainties of ϕ and γ are ∼2.22° and ∼3.20°, respectively, under sub-Alfvénic conditions (〈M_A〉 ≈ 0.5). These values increase to ∼12.08° and ∼13.60° under super-Alfvénic conditions (〈M_A〉 ≈ 2.0). When compared to the CASI-3D model, our CNN model demonstrates higher accuracy, as CASI-3D exhibits a median uncertainty of ∼6.2° and ∼18.4° under comparable sub-Alfvénic and super-Alfvénic conditions, respectively. By applying our CNN model to the L1478 molecular cloud, we constructed the first 3D magnetic field map of this cloud. The corresponding CNN-predicted POS magnetic field orientation shows remarkable alignment with that inferred from Planck 353 GHz polarization data.

It is crucial to acknowledge that despite the differences between the CNN models used by Xu, Law & Tan (2023) and in this study, the fundamental concept of utilizing spectroscopic channel maps for magnetic field investigation remains the same: (1) the intensity distribution observable in thin channel maps is predominantly influenced by turbulent velocity statistics (Lazarian & Pogosyan 2000; Kandel, Lazarian & Pogosyan 2016; Hu et al. 2023); and (2) these channel maps capture the anisotropy intrinsic to MHD turbulence, thereby revealing the orientation of the POS magnetic field (Lazarian & Yuen 2018; Hu et al. 2023). A crucial insight was provided by Hu, Lazarian & Xu (2021c), who highlighted that the degree of anisotropy in channel maps, as well as the magnetic field topology, is regulated by both γ and M_A. These are parameters that can be extracted efficiently using the CNN approach.⁴ Thus, drawing upon these foundational theoretical studies, we propose the use of the CNN model as an efficient tool for tracing 3D magnetic fields, with a physical basis that makes its feasibility interpretable.

5.2 Synergy with other methods

Our newly proposed CNN model stands as a powerful complement to existing methodologies in the field. One notable technique, which involves utilizing polarized dust emission, has proven effective in tracing the 3D magnetic field orientation within diffuse clouds, where dust grains are perfectly aligned with magnetic fields (Chen et al. 2019; Hu & Lazarian 2023a, c). However, this technique may encounter limitations within dense cloud environments, for example, those observed through tracing by C¹⁸O, where dust grains might not maintain perfect alignment (Lazarian 2007; Andersson, Lazarian & Vaillancourt 2015). This loss of alignment, resulting in a phenomenon known as the polarization hole (Pattle et al. 2019; Seifried et al. 2019; Hoang et al. 2021), introduces uncertainties when tracing 3D magnetic fields through polarized dust emission techniques.

Unlike these traditional approaches, the CNN approach is immune to the effects of the polarization hole. When supplied with emission lines from dense tracers like C¹⁸O, HNC, and NH₃, the CNN model can probe the 3D magnetic fields within dense clouds. Nonetheless, within these dense cloud environments, self-gravity can become a significant factor. This gravitational influence might alter the anisotropy observed within channel maps (Hu, Lazarian & Yuen 2020b). Therefore, the CNN model should be trained on carefully selected numerical simulations before it is applied to observational data, to ensure accurate and reliable results.

Furthermore, it should be noted that the inclination angle predicted by the CNN model is inherently limited to the range of [0, 90°]. This limitation arises because the anisotropy within channel maps alone cannot definitively discern whether the magnetic field is oriented towards or away from the observer. However, recent advancements in the field, particularly in Faraday rotation measurements within molecular clouds (Tahani et al. 2019, 2022), offer promising avenues to resolve this degeneracy.
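The degeneracy described above amounts to folding the full inclination range onto [0°, 90°]: a field inclined at γ and one inclined at 180° − γ produce the same anisotropy. A minimal sketch, with a hypothetical function name:

```python
def fold_inclination_deg(gamma_deg):
    """Map an inclination in [0, 180] degrees onto the [0, 90] range recoverable from anisotropy."""
    if not 0.0 <= gamma_deg <= 180.0:
        raise ValueError("inclination must lie within [0, 180] degrees")
    return min(gamma_deg, 180.0 - gamma_deg)

# A field pointing towards or away from the observer folds to the same value.
towards = fold_inclination_deg(30.0)
away = fold_inclination_deg(150.0)
print(towards, away)
```

An independent sign measurement, such as the Faraday rotation data mentioned above, would be needed to undo this folding.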

Another relevant method worth discussing is the Velocity Gradient Technique (VGT; González-Casanova & Lazarian 2017; Hu, Yuen & Lazarian 2018; Lazarian & Yuen 2018). Like our proposed CNN approach, the VGT is a technique that traces magnetic fields using spectroscopic observations. Importantly, both the CNN approach and VGT share a foundational physical principle: they rely on the anisotropy of MHD turbulence observed within thin channel maps. With VGT having undergone extensive and rigorous testing (Hu et al. 2019; Lu, Lazarian & Pogosyan 2020; Hu, Lazarian & Stanimirović 2021b; Alina et al. 2022; Liu, Hu & Lazarian 2022a; Schmaltz, Hu & Lazarian 2023; Tram et al. 2023; Hu & Lazarian 2023b; Liu et al. 2023b), it is established as an excellent benchmark for evaluating the accuracy of CNN models, especially in situations where polarization measurements are not readily available. This benchmarking is crucial when CNNs are deployed for tracing 3D Galactic Magnetic Fields, highlighting the important comparative and complementary roles these techniques play in advancing our understanding of magnetic fields in various astrophysical contexts.

5.3 Prospects of the CNN method

In the present study, we introduced a CNN model adept at predicting 3D magnetic fields within molecular clouds, utilizing spectroscopic observations of molecular gas. However, the potential applications of this CNN method extend far beyond, encompassing various astrophysical environments and contexts, including neutral hydrogen (H i) regions, ionized gas, the Central Molecular Zone (CMZ), external galaxies, and supernova remnants. In the following sections, we outline several promising applications of this methodology.

5.3.1 3D Galactic Magnetic Fields

A deep and comprehensive understanding of the 3D Galactic Magnetic Field (GMF; Jansson & Farrar 2012) is paramount for addressing a host of astrophysical inquiries. These include identifying the origins of ultra-high energy cosmic rays (Farrar 2014; Farrar & Sutherland 2019) and refining models of Galactic foreground polarization (Kovetz & Kamionkowski 2015; Planck Collaboration et al. 2016).

Recent research indicates that thin channel maps of H i successfully capture the anisotropy inherent in MHD turbulence (Lazarian & Pogosyan 2000; Lazarian & Yuen 2018; Lu, Lazarian & Pogosyan 2020; Hu et al. 2023). Consequently, applying the CNN to H i channel maps constitutes a viable strategy for mapping 3D GMFs. Past efforts aimed at modelling the foreground polarization with H i primarily focused on mapping the POS magnetic field orientation (Clark & Hensley 2019; Lu, Lazarian & Pogosyan 2020; Hu, Yuen & Lazarian 2020a). These endeavours largely neglected the crucial depolarization factor, the inclination angle. However, the advent of sophisticated multiphase H i simulations (Ho, Yuen & Lazarian 2021) has made it possible to train the CNN model for accurate predictions of 3D GMFs, yielding more realistic models of the foreground polarization.
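To illustrate why the inclination angle acts as a depolarization factor, the sketch below assumes the standard idealization of a uniform field with perfectly aligned grains, for which the dust polarization fraction scales as sin²γ, with γ measured from the LOS as in this paper. The function names, the maximum polarization fraction `p_max`, and the Stokes sign convention (polarization angle taken as ϕ + 90°) are assumptions made for illustration and are not taken from this work.

```python
import math

def polarization_fraction(gamma_deg, p_max=0.2):
    """Polarization fraction for a uniform field inclined by gamma from the LOS.

    Assumes p = p_max * sin^2(gamma); p_max = 0.2 is a placeholder value.
    """
    return p_max * math.sin(math.radians(gamma_deg)) ** 2

def stokes_fractions(phi_deg, gamma_deg, p_max=0.2):
    """Fractional Stokes parameters (Q/I, U/I) for a POS field orientation phi.

    Assumes dust emission polarized perpendicular to the field
    (psi = phi + 90 deg). The sign of U depends on the angle convention.
    """
    p = polarization_fraction(gamma_deg, p_max)
    psi = math.radians(phi_deg + 90.0)
    return p * math.cos(2.0 * psi), p * math.sin(2.0 * psi)

# Same POS orientation, different inclinations: the polarized signal weakens
# as the field tilts towards the LOS (gamma -> 0).
for gamma in (90.0, 60.0, 30.0):
    print(gamma, polarization_fraction(gamma))
```

A foreground model built only from the POS orientation implicitly sets γ = 90° everywhere, which is why a predicted γ map would change the modelled polarized intensity and not just its angle.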

Our primary goal in this paper is to explore the magnetic fields of molecular clouds, for which the isothermal approximation is applicable. Multiphase H i, where cooling and heating play a significant role, requires separate training of the neural network, but our general approach remains valid: intensity features/striations within channel maps continue to elongate along the POS magnetic field orientation, as supported by several studies (Lazarian & Yuen 2018; Clark & Hensley 2019; Lu, Lazarian & Pogosyan 2020; Hu, Yuen & Lazarian 2020a; Hu et al. 2023). These intensity features/striations are also regulated by the Alfvén Mach number (M_A) and the projection effect associated with the inclination angle. However, additional physics, such as thermal instability, could modify the observed anisotropy, for instance by leading to a smaller aspect ratio (Ho, Yuen & Lazarian 2023). The corresponding study employing our approach for multiphase H i will be provided elsewhere.

5.3.2 3D magnetic fields in CMZ and external galaxies

Understanding the magnetic fields within cold molecular gas is essential for deciphering the processes of formation and fueling of Seyfert nuclei. Recent measurements of magnetic fields within the CMZ and in other Seyfert galaxies have been conducted using various techniques. These include far-infrared polarization observations from instruments like SOFIA/HAWC+ (Lopez-Rodriguez et al. 2021), JCMT (Pattle et al. 2021), and ALMA (Lopez-Rodriguez et al. 2020), as well as employing the VGT (Hu, Lazarian & Wang 2022a; Hu et al. 2022c; Liu et al. 2023b). However, these approaches primarily yield the POS magnetic field orientation, falling short of providing a comprehensive 3D perspective. Nevertheless, the successful application of VGT confirms the viability of using anisotropy in molecular emission channel maps as a tracer for magnetic fields in these environments. For instance, Hu, Lazarian & Wang (2022a) derived a POS magnetic field map surrounding Sgr A* using the [Ne ii] emission line and Paschen-α image observed with the Hubble Space Telescope (HST). Given these advances, extending the CNN methodology to incorporate optical/near-infrared observations from instruments like the HST and the JWST is a feasible and promising approach for predicting 3D magnetic fields in both the Galactic Centre and external galaxies.

5.4 Obtaining the full 3D magnetic field vector

3D magnetic fields, encompassing both orientation and strength, play a pivotal role in comprehending key astrophysical phenomena. These include processes such as star formation (Mestel 1965; Mac Low & Klessen 2004; McKee & Ostriker 2007; Federrath & Klessen 2012; Lazarian, Esquivel & Crutcher 2012; Hu, Lazarian & Stanimirović 2021b), the effects of stellar feedback (Pattle et al. 2022; Liu, Hu & Lazarian 2023a), as well as the acceleration and propagation of cosmic rays (Fermi 1949; Jokipii 1966; Yan & Lazarian 2002; Xu & Yan 2013; Xu & Lazarian 2020; Beattie et al. 2022; Hu, Lazarian & Xu 2022b; Lazarian & Xu 2023). Traditionally, the strength of these fields is obtained with the Davis–Chandrasekhar–Fermi (DCF) method, which typically combines dust polarimetry with spectroscopic observations (see Davis 1951; Chandrasekhar & Fermi 1953). However, this often proves insufficient: the DCF method gives only the POS magnetic field strength, while the component along the LOS is missing. Other limitations of the DCF method have also been thoroughly dissected in the literature (Skalidis et al. 2021; Chen et al. 2022; Lazarian, Yuen & Pogosyan 2022; Liu, Qiu & Zhang 2022b).

In light of this, an alternative approach has been proposed: the use of the Alfvén Mach number M_A with the sonic Mach number M_s to derive the magnetic field’s strength (Lazarian, Yuen & Pogosyan 2020). This method, aptly termed MM2, can be used to obtain the total strength, particularly since the vital term M_A is readily available with the CNN approach proposed in this study. The sonic Mach number M_s can be procured either directly via spectroscopic line broadening or by leveraging a CNN approach similar to our current study. Coupled with the 3D magnetic field orientation, this equips us with the necessary tools to construct a 3D magnetic field vector.
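As a minimal sketch of this step, the relation between the two Mach numbers and the field strength follows from their definitions: M_A = v_turb/v_A and M_s = v_turb/c_s give v_A = c_s M_s/M_A, and hence B = c_s (4πρ)^{1/2} M_s/M_A in CGS units. Any geometrical factor arising from projection is omitted here. The function names, the convention that ϕ is measured from the x-axis of the POS, and all input numbers in the usage lines are illustrative assumptions, not values or code from this work.

```python
import math

def mm2_field_strength(mach_alfven, mach_sonic, sound_speed_cm_s, density_g_cm3):
    """Total field strength in Gauss from B = c_s * sqrt(4*pi*rho) * M_s / M_A (CGS)."""
    alfven_speed = sound_speed_cm_s * mach_sonic / mach_alfven
    return alfven_speed * math.sqrt(4.0 * math.pi * density_g_cm3)

def field_vector(strength, phi_deg, gamma_deg):
    """Return (B_x, B_y, B_los) from strength, POS angle phi and inclination gamma.

    gamma is measured from the LOS, so B_los = B cos(gamma) and B_pos = B sin(gamma).
    phi is assumed to be measured from the x-axis within the POS.
    """
    phi = math.radians(phi_deg)
    gamma = math.radians(gamma_deg)
    b_pos = strength * math.sin(gamma)
    return (b_pos * math.cos(phi), b_pos * math.sin(phi), strength * math.cos(gamma))

# Placeholder inputs for illustration only (not measurements of any cloud).
b_total = mm2_field_strength(mach_alfven=1.0, mach_sonic=5.0,
                             sound_speed_cm_s=1.9e4, density_g_cm3=1.0e-21)
print(field_vector(b_total, phi_deg=30.0, gamma_deg=60.0))
```

Because ϕ is an orientation defined modulo 180°, and anisotropy alone does not fix the sense of the LOS component, a vector built this way is determined only up to these sign ambiguities.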

6 SUMMARY

In this study, a CNN model was designed for the intricate task of probing 3D magnetic fields within molecular clouds. This model is not confined to determining the POS magnetic field orientation but extends its capabilities to accurately ascertain the field’s inclination angle and the total Alfvén Mach number, offering a more comprehensive understanding of the magnetic field in the observed regions. We summarize our major results below:

We developed a CNN model for probing the 3D magnetic fields, including the POS magnetic field orientation, inclination angle, and total Alfvén Mach number.
The CNN model was trained using synthetic ¹³CO and C¹⁸O (J = 1–0) emission lines, encompassing a range of conditions from sub-Alfvénic to super-Alfvénic. We quantified the uncertainties associated with the trained CNN model’s predictions. Our findings revealed that the uncertainties are less than 5° for both ϕ and γ, and are smaller than 0.2 for M_A under sub-Alfvénic conditions (with M_A ≈ 0.5). Under super-Alfvénic conditions (with M_A ≈ 2.0), the uncertainties increased slightly but remained below 15° for ϕ and γ, and were around 1.5 for M_A.
We applied our trained CNN model to the molecular cloud L1478. The CNN-predicted POS magnetic field orientation exhibited remarkable agreement with orientations inferred from Planck 353 GHz polarization data, with a marginal global difference of approximately 10°.
This study facilitated the construction of the first 3D magnetic field map for the L1478 cloud. Through our analysis, we found that the cloud’s global inclination angle is approximately 76°, while the global total Alfvén Mach number is close to 1.07.
We discussed the potential applications and future prospects of the CNN approach. Particularly, we discussed the feasibility and potential of utilizing the CNN model for predicting 3D GMFs. We also considered its application for understanding 3D magnetic fields in the CMZ and external galaxies.

Acknowledgement

YH and AL acknowledge the support of NASA ATP AAH7546, NSF grants AST 2307840, and ALMA SOSPADA-016. Financial support for this work was provided by NASA through award 09_0231 issued by the Universities Space Research Association, Inc. (USRA). This work used SDSC Expanse CPU at SDSC through allocations PHY230032, PHY230033, PHY230091, and PHY230105 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296. YH acknowledges the very kind computational and technical support of Bowen Cao.

DATA AVAILABILITY

The data underlying this article will be shared on reasonable request to the corresponding author.

Footnotes

The impact of galactic rotation on velocity caustics was explored by Lazarian & Pogosyan (2000). Its effects were shown to be insignificant (Hu et al. 2023).

The use of wing channels has its own advantages through increasing the ratio of velocity to density fluctuations (Yuen, Ho & Lazarian 2021; Hu et al. 2023).

AM = 1 implies a perfect parallel alignment, while −1 indicates perpendicularity.

Note that while the anisotropy and magnetic field topology, which are sensitive to γ and M_A, are the most apparent features in channel maps, it is also possible that the CNN extracts additional features to facilitate the prediction.
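The alignment measure (AM) referred to in the footnotes above is commonly defined as AM = 2(⟨cos² θ_r⟩ − 1/2), where θ_r is the relative angle between two orientation maps. That definition is assumed here, since this part of the text does not restate it. A minimal sketch with hypothetical function and variable names:

```python
import math

def alignment_measure(angles_a_deg, angles_b_deg):
    """AM = 2 * (<cos^2(theta_r)> - 1/2) for two lists of orientation angles in degrees.

    cos^2 is insensitive to 180-degree flips, so orientations (not directions)
    can be compared directly.
    """
    if len(angles_a_deg) != len(angles_b_deg) or len(angles_a_deg) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(angles_a_deg, angles_b_deg):
        total += math.cos(math.radians(a - b)) ** 2
    return 2.0 * (total / len(angles_a_deg) - 0.5)

# Hypothetical orientation angles, e.g. CNN-predicted versus polarization-inferred.
print(alignment_measure([10.0, 40.0, 85.0], [12.0, 35.0, 95.0]))
```

With this definition, randomly oriented pairs average to AM ≈ 0, which sits between the two limiting cases quoted in the footnote.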

References

Abbate, Possenti, Tiburzi, Barr, van Straten, Ridolfi, Freire, 2020, Nat. Astron., 704
