| Scenario | Metric | Classic | Deep Learning | Foundation Model |
|---|---|---|---|---|

| Model | Category | Elo | Improvability | Average Rank | MRR |
|---|---|---|---|---|---|
| Coxnet (Tuned) | classic | 1124.9 [1070.8, 1187.7] | 0.056 [0.038, 0.072] | 4.58 | 0.343 |
| PopSICL | foundation | 1082.0 [1015.5, 1162.1] | 0.072 [0.044, 0.107] | 5.28 | 0.375 |
| Coxnet | classic | 1057.7 [1003.7, 1116.6] | 0.073 [0.048, 0.095] | 5.69 | 0.341 |
| SSVM (Tuned) | classic | 1049.7 [974.8, 1127.6] | 0.073 [0.053, 0.099] | 5.83 | 0.291 |
| DeepSurv | deep | 1012.3 [953.0, 1079.3] | 0.089 [0.064, 0.114] | 6.50 | 0.236 |
| RSF | classic | 1000.0 [957.3, 1068.2] | 0.080 [0.050, 0.112] | 6.72 | 0.222 |
| RankDeepSurv | deep | 998.5 [943.5, 1056.1] | 0.090 [0.063, 0.119] | 6.75 | 0.216 |
| RSF (Tuned) | classic | 965.0 [909.5, 1024.7] | 0.089 [0.062, 0.118] | 7.36 | 0.193 |
| GBSE (Tuned) | classic | 965.0 [902.9, 1040.9] | 0.088 [0.065, 0.110] | 7.36 | 0.265 |
| SSVM | classic | 888.7 [821.9, 954.6] | 0.131 [0.106, 0.159] | 8.75 | 0.190 |
| GBSE | classic | 879.3 [796.2, 959.4] | 0.110 [0.081, 0.140] | 8.92 | 0.160 |
| TabPFN* | foundation | 855.5 [769.6, 947.3] | 0.152 [0.100, 0.199] | 9.33 | 0.188 |
| DeepHit | deep | 809.1 [741.7, 881.8] | 0.175 [0.124, 0.227] | 10.11 | 0.138 |
| DeepWeiSurv (p=2) | deep | 556.0 [439.2, 631.9] | 0.241 [0.189, 0.289] | 13.28 | 0.079 |
| DeepWeiSurv (p=1) | deep | 526.7 [401.7, 619.5] | 0.251 [0.201, 0.296] | 13.53 | 0.080 |
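
The Average Rank and MRR columns are not defined in this excerpt. The sketch below assumes the usual definitions: a model's rank is computed per dataset (1 = best), Average Rank is the mean of those ranks, and MRR is the mean of their reciprocals. The model names, the scores, and the assumption that a higher score is better are all hypothetical and are not taken from the benchmark.

```python
from statistics import mean


def rank_models(scores):
    """Rank models on one dataset (1 = best). Assumes higher score is better.

    Tied models share the average of the rank positions they occupy.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of models tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared_rank
        i = j + 1
    return ranks


def summarize(per_dataset_scores):
    """Return average rank and mean reciprocal rank (MRR) for each model."""
    per_dataset_ranks = [rank_models(s) for s in per_dataset_scores]
    models = list(per_dataset_ranks[0])
    return {
        m: {
            "average_rank": mean(r[m] for r in per_dataset_ranks),
            "mrr": mean(1 / r[m] for r in per_dataset_ranks),
        }
        for m in models
    }


# Hypothetical toy scores for three made-up models on three made-up datasets.
toy_scores = [
    {"model_a": 0.71, "model_b": 0.69, "model_c": 0.65},
    {"model_a": 0.60, "model_b": 0.66, "model_c": 0.58},
    {"model_a": 0.80, "model_b": 0.75, "model_c": 0.77},
]
summary = summarize(toy_scores)
for name, stats in summary.items():
    print(f"{name}: average rank {stats['average_rank']:.2f}, MRR {stats['mrr']:.3f}")
```

Under these definitions, MRR weights first-place finishes much more heavily than Average Rank does. That would be one way to read the two top rows of the table, where PopSICL has a higher MRR (0.375) than Coxnet (Tuned) (0.343) but a worse Average Rank (5.28 versus 4.58).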