Fig. 1.
Schematic representation of the ska lo analysis of WGS data. The circled numbers correspond to the algorithm's steps as described in the main text; the optional step 5 (SNP positioning) is omitted due to space constraints. The numbers next to the sequences are their names. In the graph, orange and red nodes represent entry and exit nodes, respectively. In this example, the WGS dataset consisted of ten samples containing three SNPs and one indel, with split k-mers extracted using k = 7. ska lo inferred three SNPs (highlighted by rectangles in the path pseudoalignment); the indel was detected but not inferred because its fraction of missing data was 0.4.