Scenario: Samples ≤ 1000 & Features ≤ 100; Samples > 1000 | Features > 1000

Metric: Elo (based on wins and losses in simulated games); Improvability (based on the difference to the best-performing model)

Model types: Classic, Deep Learning, Foundation Model

| Model | Type | Elo | Improvability | Average Rank | MRR |
|---|---|---|---|---|---|
| Coxnet (Tuned) | Classic | 1124.9 [1070.8, 1187.7] | 0.056 [0.038, 0.072] | 4.58 | 0.343 |
| PopSICL | Foundation Model | 1082.0 [1015.5, 1162.1] | 0.072 [0.044, 0.107] | 5.28 | 0.375 |
| Coxnet | Classic | 1057.7 [1003.7, 1116.6] | 0.073 [0.048, 0.095] | 5.69 | 0.341 |
| SSVM (Tuned) | Classic | 1049.7 [974.8, 1127.6] | 0.073 [0.053, 0.099] | 5.83 | 0.291 |
| DeepSurv | Deep Learning | 1012.3 [953.0, 1079.3] | 0.089 [0.064, 0.114] | 6.50 | 0.236 |
| RSF | Classic | 1000.0 [957.3, 1068.2] | 0.080 [0.050, 0.112] | 6.72 | 0.222 |
| RankDeepSurv | Deep Learning | 998.5 [943.5, 1056.1] | 0.090 [0.063, 0.119] | 6.75 | 0.216 |
| RSF (Tuned) | Classic | 965.0 [909.5, 1024.7] | 0.089 [0.062, 0.118] | 7.36 | 0.193 |
| GBSE (Tuned) | Classic | 965.0 [902.9, 1040.9] | 0.088 [0.065, 0.110] | 7.36 | 0.265 |
| SSVM | Classic | 888.7 [821.9, 954.6] | 0.131 [0.106, 0.159] | 8.75 | 0.190 |
| GBSE | Classic | 879.3 [796.2, 959.4] | 0.110 [0.081, 0.140] | 8.92 | 0.160 |
| TabPFN* | Foundation Model | 855.5 [769.6, 947.3] | 0.152 [0.100, 0.199] | 9.33 | 0.188 |
| DeepHit | Deep Learning | 809.1 [741.7, 881.8] | 0.175 [0.124, 0.227] | 10.11 | 0.138 |
| DeepWeiSurv (p=2) | Deep Learning | 556.0 [439.2, 631.9] | 0.241 [0.189, 0.289] | 13.28 | 0.079 |
| DeepWeiSurv (p=1) | Deep Learning | 526.7 [401.7, 619.5] | 0.251 [0.201, 0.296] | 13.53 | 0.080 |
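
To make the rank-based columns concrete, the following is a minimal sketch of how Average Rank, MRR (mean reciprocal rank), and an improvability-style gap could be aggregated from per-dataset scores. It is not the benchmark's own code. It assumes a higher-is-better score (for example a concordance index), ties sharing their average rank, and improvability taken as the relative gap to the best score on each dataset; the exact normalization used for the table is not stated in the source. The function names, model names, and toy scores are hypothetical.

```python
from statistics import mean


def rank_descending(scores):
    """Return 1-based ranks (1 = best score); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def summarize(scores_by_dataset):
    """Aggregate {dataset: {model: score}} into per-model summary metrics.

    Assumption: every dataset reports a positive, higher-is-better score
    for the same set of models.
    """
    models = sorted(next(iter(scores_by_dataset.values())))
    ranks = {m: [] for m in models}
    gaps = {m: [] for m in models}
    for per_model in scores_by_dataset.values():
        values = [per_model[m] for m in models]
        best = max(values)
        for m, r, v in zip(models, rank_descending(values), values):
            ranks[m].append(r)
            # Assumed definition: relative gap to the best model on this dataset.
            gaps[m].append((best - v) / best)
    return {
        m: {
            "average_rank": mean(ranks[m]),
            "mrr": mean(1 / r for r in ranks[m]),
            "improvability": mean(gaps[m]),
        }
        for m in models
    }


# Hypothetical toy scores, not taken from the table above.
toy = {
    "dataset_a": {"model_x": 0.80, "model_y": 0.70, "model_z": 0.60},
    "dataset_b": {"model_x": 0.60, "model_y": 0.75, "model_z": 0.50},
}
summary = summarize(toy)
for model, metrics in summary.items():
    print(model, metrics)
```

Because MRR rewards first places much more strongly than middling ranks, a model can have a higher MRR than another model with a better average rank, as PopSICL and Coxnet (Tuned) do in the table.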