Ransomware Detection

Experiments

10 experiment runs logged.

GIN · internal_only

First baseline experiment. Train GIN on the internal_only FCG dataset using a standard stratified 70/15/15 train/val/test split to establish a performance reference for graph-based ransomware detection.

Accuracy: 98.2% · F1 Macro: 97.8% · Recall: 100.0% · AUROC: 99.0%

Feb 17, 2026 · 135 epochs · 884s
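
The stratified 70/15/15 split used for this baseline can be sketched as follows. This is a minimal illustration, not the project's actual loader: the function name `stratified_split`, the seed, and the toy label list are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Hypothetical toy labels (not the real dataset): 1 = ransomware, 0 = benign.
labels = [1] * 40 + [0] * 60
train_idx, val_idx, test_idx = stratified_split(labels)
```

Splitting each class separately keeps the ransomware/benign ratio the same in all three subsets, which matters when one class is much smaller than the other.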

GIN · full_fcg

Train GIN on the full_fcg dataset (all methods, internal + external) to compare with the internal_only result from Experiment 02.

Accuracy: 95.4% · F1 Macro: 94.7% · Recall: 100.0% · AUROC: 97.7%

Feb 17, 2026 · 113 epochs · 935s

GCN · internal_only

Train GCN on the internal_only dataset to compare GNN architectures. Uses the same hyperparameters and data split as the GIN experiments.

Accuracy: 96.3% · F1 Macro: 95.6% · Recall: 93.8% · AUROC: 98.4%

Feb 17, 2026 · 108 epochs · 1057s

GCN · full_fcg

Train GCN on the full_fcg dataset to complete the GCN comparison across both graph representations.

Accuracy: 94.4% · F1 Macro: 93.5% · Recall: 93.8% · AUROC: 98.9%

Feb 17, 2026 · 101 epochs · 1449s

GAT · internal_only

Train GAT on the internal_only dataset to complete the three-model architecture comparison (GIN, GCN, GAT).

Accuracy: 96.3% · F1 Macro: 95.6% · Recall: 96.9% · AUROC: 98.5%

Feb 18, 2026 · 100 epochs · 2047s

GAT · full_fcg

Train GAT on the full_fcg dataset to complete the 3-model × 2-dataset baseline grid.

Accuracy: 98.2% · F1 Macro: 97.8% · Recall: 96.9% · AUROC: 99.4%

Feb 18, 2026 · 113 epochs · 3646s

GIN · internal_only · Cross-Validation

4-fold cross-validation of GIN on internal_only to verify that the strong baseline result from Experiment 02 is not due to a lucky train/test split. Each fold uses a different 25% of the data as the test set.

Accuracy: 95.9% · F1 Macro: 95.4% · Recall: 99.5% · AUROC: 98.8%

Mar 4, 2026 · 109 epochs · 3562s
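
A 4-fold scheme in which each fold tests on a different 25% of the data could look like the sketch below. The card does not say whether the folds are stratified; stratification is assumed here, and `stratified_kfold` and the toy labels are hypothetical names and values used only for illustration.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=4, seed=0):
    """Yield (train_indices, test_indices) for each of k stratified folds."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class round-robin into k folds so class ratios stay balanced.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves as the test set exactly once.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j in range(k) if j != i for idx in folds[j])
        yield train, test


# Hypothetical toy labels (not the real dataset): 1 = ransomware, 0 = benign.
labels = [1] * 40 + [0] * 60
splits = list(stratified_kfold(labels, k=4))
```

Because every sample lands in exactly one test fold, a result that holds across all four folds is less likely to come from one favourable split.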

GIN · internal_only · Family Holdout

Hold out the simplelocker and wipelocker families entirely from training to test whether GIN can detect ransomware families it has never seen. These two families form the first holdout test.

Accuracy: 95.7% · F1 Macro: 95.2% · Recall: 100.0% · AUROC: 96.0%

Mar 4, 2026 · 100 epochs · 962s
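
A family-holdout split of this kind can be sketched as below. The card does not describe how benign samples are divided, so the sketch assumes a random benign test fraction; the record layout (`family` set to `None` for benign apps), the function name, and the sample counts are all hypothetical.

```python
import random


def family_holdout_split(samples, held_out, benign_test_fraction=0.2, seed=0):
    """Split so that held-out families appear only in the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for sample in samples:
        family = sample["family"]
        if family is None:
            # Assumption: benign samples are divided at random.
            if rng.random() < benign_test_fraction:
                test.append(sample)
            else:
                train.append(sample)
        elif family in held_out:
            test.append(sample)   # unseen family: evaluation only
        else:
            train.append(sample)  # remaining families: training only
    return train, test


# Hypothetical toy records; the counts are made up for the example.
samples = (
    [{"family": "simplelocker"} for _ in range(5)]
    + [{"family": "wipelocker"} for _ in range(3)]
    + [{"family": "wannalocker"} for _ in range(4)]
    + [{"family": "blackroselucy"} for _ in range(4)]
    + [{"family": None} for _ in range(20)]
)
held_out = {"simplelocker", "wipelocker"}
train, test = family_holdout_split(samples, held_out)
```

The key property is that no graph from a held-out family reaches the training set, so test recall measures detection of genuinely unseen families.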

GIN · internal_only · Family Holdout

Hold out wannalocker and blackroselucy families from training. Same setup as Experiment 10 but with different held-out families to test whether generalisation depends on which families are excluded.

Accuracy: 54.5% · F1 Macro: 44.0% · Recall: 11.8% · AUROC: 89.9%

Mar 4, 2026 · 100 epochs · 973s
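
Accuracy, macro F1, and recall depend on a decision threshold, whereas AUROC depends only on how the scores rank the samples. The sketch below shows one way the four reported metrics could be computed, assuming ransomware is the positive class and a 0.5 threshold; neither assumption is stated in the log, and the labels and scores are made-up toy values rather than outputs of any experiment here.

```python
def f1_score(tp, fp, fn):
    """F1 for one class from its true positives, false positives, false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def evaluate(y_true, scores, threshold=0.5):
    """Compute accuracy, macro F1, positive-class recall, and AUROC."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Macro F1: unweighted mean of the per-class F1 scores.
    macro_f1 = (f1_score(tp, fp, fn) + f1_score(tn, fn, fp)) / 2

    # AUROC as the probability that a positive outscores a negative (ties count 0.5).
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auroc = wins / (len(pos) * len(neg))

    return {"accuracy": accuracy, "macro_f1": macro_f1, "recall": recall, "auroc": auroc}


# Toy values only: every positive outranks every negative, yet most
# positives score below the 0.5 threshold.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.45, 0.40, 0.35, 0.60, 0.30, 0.20, 0.10, 0.05]
result = evaluate(y_true, scores)
```

In this toy case the ranking is perfect while only one of four positives clears the threshold, which illustrates that a high AUROC and a low recall can coexist. Whether that applies to the run above cannot be determined from the log alone.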

GIN · internal_only · LOFO

Full leave-one-family-out (LOFO) evaluation of GIN on internal_only. Trains 6 separate models, each time holding out one ransomware family entirely, to systematically measure whether the baseline can detect ransomware families not seen during training.

Accuracy: 70.7% · F1 Macro: 48.6% · Recall: 18.1% · AUROC: 85.1%

Mar 5, 2026 · 111 epochs · 6850s
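
The leave-one-family-out loop can be sketched as a generator that produces one train/test split per family. This is an illustration under assumptions: the benign samples are split once and shared across all runs, the record layout (`family` set to `None` for benign apps) is hypothetical, and only four family names from the cards above are used with made-up counts.

```python
import random


def lofo_splits(samples, benign_test_fraction=0.2, seed=0):
    """Yield (held_out_family, train, test), one split per ransomware family."""
    rng = random.Random(seed)

    # Assumption: benign samples are split once and reused in every run.
    benign = [s for s in samples if s["family"] is None]
    rng.shuffle(benign)
    n_test = round(benign_test_fraction * len(benign))
    benign_test, benign_train = benign[:n_test], benign[n_test:]

    families = sorted({s["family"] for s in samples if s["family"] is not None})
    for held_out in families:
        # Train on every other family; test only on the held-out one.
        train = benign_train + [
            s for s in samples if s["family"] not in (None, held_out)
        ]
        test = benign_test + [s for s in samples if s["family"] == held_out]
        yield held_out, train, test


# Hypothetical toy records; the counts are made up for the example.
samples = (
    [{"family": "simplelocker"} for _ in range(5)]
    + [{"family": "wipelocker"} for _ in range(3)]
    + [{"family": "wannalocker"} for _ in range(4)]
    + [{"family": "blackroselucy"} for _ in range(4)]
    + [{"family": None} for _ in range(20)]
)
runs = list(lofo_splits(samples))
```

Each yielded split would train one separate model, so a dataset with six families produces six models, matching the setup described in the card.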