GIN · internal_only
Leave-One-Family-Out · Mar 5, 2026 · 32f2e82e8a584eb5b31fdfb7a4b01fc2
Description
Full leave-one-family-out (LOFO) evaluation of GIN on internal_only. Trains 6 separate models, each time holding out one ransomware family entirely, to systematically measure whether the baseline can detect ransomware families not seen during training.
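The splitting scheme described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used for this run: the `lofo_splits` helper, the toy sample IDs, and the placeholder family names are all hypothetical, and the round-robin assignment of benign samples to folds is an assumption, since the report does not say how benign samples were divided.

```python
def lofo_splits(samples):
    """Yield (held_out_family, train, test) once per ransomware family.

    `samples` is a list of (sample_id, family) pairs; family is None for
    benign samples. Assumption: benign samples are dealt round-robin
    across folds so that every test set contains both classes.
    """
    families = sorted({fam for _, fam in samples if fam is not None})
    benign = [s for s in samples if s[1] is None]
    for i, held_out in enumerate(families):
        test_benign = benign[i::len(families)]
        test_benign_ids = {sid for sid, _ in test_benign}
        # The held-out family appears only in the test set.
        test = [s for s in samples if s[1] == held_out] + test_benign
        train = [
            s for s in samples
            if (s[1] is not None and s[1] != held_out)
            or (s[1] is None and s[0] not in test_benign_ids)
        ]
        yield held_out, train, test


# Toy data with placeholder family names (not the families in this run).
samples = (
    [(f"mal_{fam}_{k}", fam) for fam in ("family_a", "family_b", "family_c") for k in range(2)]
    + [(f"benign_{k}", None) for k in range(6)]
)

splits = list(lofo_splits(samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

One model would then be trained per split, so the number of trained models equals the number of families.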
Conclusion
Main result of the baseline study. Mean malware recall across held-out families is just 18.1%, with wipelocker, blackroselucy, and filecoder at 0% recall. Confirms that the GIN baseline memorises family-specific structural patterns and does not generalise to unseen families. Family-aware evaluation is the honest benchmark for this thesis.
Mean Test Metrics (across holdouts)
| Metric | Value |
|---|---|
| Accuracy | 70.7% |
| F1 Macro | 48.6% |
| F1 Malware | 17.2% |
| Precision | 17.4% |
| Recall | 18.1% |
| AUROC | 85.1% |
| Best Val Loss | 0.1119 |
| Training Time | 6849.8 s |
Summed Confusion Matrix (all holdouts)
| | Pred Benign | Pred Malware |
|---|---|---|
| Actual Benign | 398 | 52 |
| Actual Malware | 163 | 50 |
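As a cross-check, pooled metrics can be derived directly from the summed confusion matrix. The short sketch below treats malware as the positive class; the variable names are illustrative only.

```python
# Counts from the summed confusion matrix (malware = positive class).
tn, fp = 398, 52   # Actual Benign row
fn, tp = 163, 50   # Actual Malware row

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1_malware = 2 * tp / (2 * tp + fp + fn)

print(f"samples    : {total}")          # 663
print(f"accuracy   : {accuracy:.1%}")   # 67.6%
print(f"precision  : {precision:.1%}")  # 49.0%
print(f"recall     : {recall:.1%}")     # 23.5%
print(f"F1 malware : {f1_malware:.1%}") # 31.7%
```

These pooled values differ from the per-holdout means reported above (for example, 23.5% pooled recall against 18.1% mean recall). Assuming the reported figures are unweighted means over the six holdouts, that is expected: pooling weights each family by its sample count, while the mean weights every holdout equally.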
Configuration
| Parameter | Value |
|---|---|
| Hidden Dim | 128 |
| Num Layers | 3 |
| Dropout | 0.5 |
| Batch Size | 4 |
| Learning Rate | 0.001 |
| Weight Decay | 0.0001 |
| Max Epochs | 200 |
| ES Patience | 20 |
| ES Min Epochs | 100 |
| LR Patience | 10 |
| LR Factor | 0.5 |
| Mixed Precision | Yes |
| Random Seed | 42 |
| Epochs Trained | 111 |
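One plausible reading of how Max Epochs, ES Patience, and ES Min Epochs interact is sketched below. The `EarlyStopper` class is hypothetical and the semantics are an assumption: the patience counter runs from the first epoch, but stopping is only allowed once the minimum epoch count is reached. The actual training code may differ, and the validation-loss curve here is synthetic.

```python
class EarlyStopper:
    """Early stopping with a minimum-epoch floor (assumed semantics)."""

    def __init__(self, patience=20, min_epochs=100):
        self.patience = patience
        self.min_epochs = min_epochs
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one 1-indexed epoch; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Stopping is blocked until min_epochs, even if patience has run out.
        return epoch >= self.min_epochs and self.bad_epochs >= self.patience


stopper = EarlyStopper(patience=20, min_epochs=100)
stopped_at = None
for epoch in range(1, 201):  # Max Epochs = 200
    # Synthetic curve: improves until epoch 50, then plateaus at a worse value.
    val_loss = 1.0 / epoch if epoch <= 50 else 0.05
    if stopper.step(epoch, val_loss):
        stopped_at = epoch
        break

print(stopped_at)  # 100
```

In this toy run the patience budget is exhausted at epoch 70, but the minimum-epoch floor delays the stop until epoch 100.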