Overview of CellMATE. CellMATE uniquely uses a multi-head adversarial training module to enable nonlinear early integration of sc-multiomics data. The input multimodal data (X), formed by concatenating the features from all modalities, are used jointly to learn a modal-free, low-dimensional stochastic latent space z. Modal-specific adversarial networks, which learn each modality's distribution and noise, then auto-tune z. An adversarial loss (L_adv) and a recovery loss (L_rec) are introduced alongside the evidence lower bound (L_ELBO) to ensure the accuracy and reliability of both z and the reconstructed multimodal data. The figure was created with BioRender.
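
The caption names three loss terms but does not say how they are combined. One plausible reading, offered only as a sketch, is a weighted sum over M modalities. The weights \lambda_{adv} and \lambda_{rec}, the per-modality index m, and the additive form are all assumptions made for illustration, not details given in the caption. The ELBO term is written in the standard variational autoencoder form as a negative ELBO so that all three terms are minimized; in actual adversarial training the discriminator heads would optimize an opposing objective.

```latex
% Hypothetical combined objective: weights and additive form are assumptions.
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{ELBO}}
  + \lambda_{\text{adv}} \sum_{m=1}^{M} \mathcal{L}_{\text{adv}}^{(m)}
  + \lambda_{\text{rec}} \, \mathcal{L}_{\text{rec}},
\qquad
% Standard negative ELBO of a VAE with encoder q(z | X) and prior p(z).
\mathcal{L}_{\text{ELBO}}
  = -\,\mathbb{E}_{q(z \mid X)}\!\left[\log p(X \mid z)\right]
  + \mathrm{KL}\!\left(q(z \mid X) \,\|\, p(z)\right)
```

Here X is the concatenated multimodal input and z the shared latent variable, as in the caption; each \mathcal{L}_{adv}^{(m)} stands for the loss contributed by the adversarial head of modality m.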