Metric | Native/ESMFold | ProteinMPNN | ProstT5 | ProstT5(RoundTrip70) |
---|---|---|---|---|
lDDT ↑ | 0.78 ± 0.01 | 0.77 ± 0.01 | 0.72 ± 0.01 | 0.73 ± 0.01 |
RMSD ↓ | 2.55 ± 0.01 | 2.61 ± 0.01 | 2.90 ± 0.01 | 2.81 ± 0.01 |
TM-score ↑ | 0.62 ± 0.02 | 0.61 ± 0.02 | 0.58 ± 0.02 | 0.60 ± 0.02 |
PIDE | 100 ± 0 | 29.6 ± 1 | 21.9 ± 0.9 | 22.4 ± 0.9 |
Entropy ↓ | 0.13 ± 0.01 | 0.39 ± 0.03 | 0.20 ± 0.01 | 0.19 ± 0.01 |
*Performance: structural similarity between ESMFold (8) and AlphaFold2 (32) predictions for native (Native/ESMFold) and generated sequences in our test set. Sequences were generated using ProteinMPNN, ProstT5, and a filtered version of ProstT5 (ProstT5(RoundTrip70)), which uses the model's intrinsic back-translation to filter by sequence similarity between native 3Di sequences and their counterparts predicted from generated AA sequences (3Di→AA→3Di). We generated AA sequences either until convergence (defined as ≥70% pairwise sequence identity, PIDE, over 3Di letters) or for at most ten attempts (to conserve resources). Single-sequence ESMFold predictions for generated sequences were compared against the native ground truth predicted by AlphaFold2 using lDDT (60), RMSD, TM-score (61), PIDE, and entropy (KL-divergence between the AA distribution in UniProt and that of the generated sequences). The ± values indicate 95% confidence intervals estimated from 1000 bootstrap samples. Arrows next to metrics indicate whether higher (↑) or lower (↓) values are better. PIDE carries no arrow because, for inverse folding, it is not clear whether higher is necessarily better.
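The round-trip filter described in the caption (3Di→AA→3Di, stopping at ≥70% 3Di PIDE or after ten attempts) can be sketched as a simple retry loop. This is a minimal illustration, not the authors' implementation: `generate_aa` and `predict_3di` are hypothetical callables standing in for the two translation directions of the model, PIDE is computed position by position without alignment, and keeping the best-scoring candidate when no attempt converges is an assumption, since the caption does not say which sequence is retained in that case.

```python
from typing import Callable, Tuple


def pide(a: str, b: str) -> float:
    """Percentage of identical positions between two strings.

    Simplification (assumption): positions are compared without alignment,
    and any length difference counts as mismatches.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / longest


def round_trip_generate(
    native_3di: str,
    generate_aa: Callable[[str], str],   # hypothetical: 3Di -> AA sampler
    predict_3di: Callable[[str], str],   # hypothetical: AA -> 3Di back-translation
    threshold: float = 70.0,             # convergence criterion from the caption
    max_attempts: int = 10,              # attempt cap from the caption
) -> Tuple[str, float, int]:
    """Sample AA sequences until the back-translated 3Di string reaches
    `threshold` PIDE against the native 3Di string, or attempts run out.

    Returns (best AA sequence, its round-trip PIDE, attempts used).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    best_aa, best_score = "", -1.0
    attempts_used = 0
    for attempts_used in range(1, max_attempts + 1):
        candidate = generate_aa(native_3di)
        score = pide(native_3di, predict_3di(candidate))
        # Assumption: keep the best candidate seen so far as a fallback.
        if score > best_score:
            best_aa, best_score = candidate, score
        if score >= threshold:
            break
    return best_aa, best_score, attempts_used
```

With a stochastic sampler plugged in as `generate_aa`, each call yields a new candidate, so the loop trades up to ten generation and back-translation passes for a sequence whose predicted 3Di string stays close to the native one.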