Figure 4.
Gorodkin measure on each data set for each split method, with the value shown above each bar. The vertical line on top of each bar shows the standard deviation. Training and validation performances are calculated using a k-fold scheme, whereas test and hold-out performances are not. This might explain some of the small differences in performance between them, since the individual sets may differ in difficulty for the model.
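
For readers who want to reproduce this kind of bar, the following is a minimal sketch of how a Gorodkin measure (R_K, the multiclass generalization of the Matthews correlation coefficient) and its spread over folds could be computed. The function name `gorodkin`, the variable `fold_confusions`, and the confusion matrices are hypothetical and purely illustrative; they are not the data behind the figure. The use of the sample standard deviation is also an assumption, since the caption does not say which estimator was used.

```python
import math
from statistics import mean, stdev


def gorodkin(confusion):
    """Gorodkin R_K computed from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    s = sum(sum(row) for row in confusion)        # total number of samples
    c = sum(confusion[i][i] for i in range(k))    # correctly classified samples
    t = [sum(confusion[i]) for i in range(k)]     # true count per class
    p = [sum(confusion[i][j] for i in range(k)) for j in range(k)]  # predicted count per class
    numerator = c * s - sum(pk * tk for pk, tk in zip(p, t))
    denominator = math.sqrt(
        (s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t))
    )
    # Convention assumed here: return 0 when the measure is undefined.
    return numerator / denominator if denominator else 0.0


# Hypothetical confusion matrices, one per fold (illustrative numbers only).
fold_confusions = [
    [[8, 1, 1], [1, 7, 2], [0, 2, 8]],
    [[9, 0, 1], [2, 6, 2], [1, 1, 8]],
    [[7, 2, 1], [1, 8, 1], [1, 1, 8]],
]
fold_scores = [gorodkin(cm) for cm in fold_confusions]
print(f"k-fold: mean={mean(fold_scores):.3f}, sd={stdev(fold_scores):.3f}")
```

A set evaluated only once yields a single score rather than a mean over folds, which is one reason its value can deviate slightly from the k-fold averages.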