GIN · internal_only
Family Holdout · Mar 4, 2026 · d750fad5e67b4b7b98262d9660c88adc
Description
Hold out wannalocker and blackroselucy families from training. Same setup as Experiment 10 but with different held-out families to test whether generalisation depends on which families are excluded.
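A minimal sketch of what a family-level hold-out split can look like, assuming each malware sample carries a family label. The `Sample` type, the `split_by_family` helper, and the placeholder families `family_a` and `family_b` are hypothetical; only the two held-out family names come from this report. How benign samples are divided between train and test is not stated here, so the sketch covers malware samples only.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Sample(NamedTuple):
    sample_id: str  # hypothetical identifier
    family: str     # malware family label (assumed to be available per sample)


# The two families excluded from training in this experiment.
HELD_OUT_FAMILIES = frozenset({"wannalocker", "blackroselucy"})


def split_by_family(
    samples: Iterable[Sample],
    held_out: frozenset = HELD_OUT_FAMILIES,
) -> Tuple[List[Sample], List[Sample]]:
    """Send every sample of a held-out family to test, the rest to train."""
    train: List[Sample] = []
    test: List[Sample] = []
    for sample in samples:
        (test if sample.family in held_out else train).append(sample)
    return train, test


# Toy data; "family_a" and "family_b" are placeholders, not real families.
demo = [
    Sample("s1", "wannalocker"),
    Sample("s2", "family_a"),
    Sample("s3", "blackroselucy"),
    Sample("s4", "family_b"),
]
train_set, test_set = split_by_family(demo)
```

The point of splitting on the family label rather than on individual samples is that no variant of a held-out family can leak into training.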
Conclusion
Performance collapses: 54.5% accuracy and only 11.8% malware recall (8 of 68 malware samples detected). This is the first strong evidence that the baseline learns family-specific patterns rather than general ransomware behaviour. Generalisation depends heavily on which families are held out: the model cannot reliably detect wannalocker or blackroselucy without training examples.
Test Metrics
| Metric | Value |
|---|---|
| Accuracy | 54.5% |
| F1 Macro | 44.0% |
| F1 Malware | 19.8% |
| Precision (malware) | 61.5% |
| Recall (malware) | 11.8% |
| AUROC | 89.9% |
| Best Val Loss | 0.1079 |
| Training Time | 973.2 s |
Confusion Matrix
| | Pred Benign | Pred Malware |
|---|---|---|
| Actual Benign | 70 | 5 |
| Actual Malware | 60 | 8 |
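The threshold-dependent test metrics can be re-derived from the four confusion-matrix counts, which is a quick way to check the reported figures. The sketch below treats malware as the positive class; the function name `metrics_from_confusion` is illustrative and not taken from the experiment code.

```python
def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute accuracy, precision, recall and F1 scores from raw counts.

    Malware is treated as the positive class.
    """
    total = tn + fp + fn + tp
    f1_malware = 2 * tp / (2 * tp + fp + fn)
    f1_benign = 2 * tn / (2 * tn + fn + fp)
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1_malware": f1_malware,
        "f1_macro": (f1_malware + f1_benign) / 2,
    }


# Counts from the confusion matrix above.
m = metrics_from_confusion(tn=70, fp=5, fn=60, tp=8)
as_percent = {name: round(value * 100, 1) for name, value in m.items()}
print(as_percent)
```

AUROC is not recoverable this way because it depends on the full ranking of scores rather than on a single decision threshold.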
Configuration
| Parameter | Value |
|---|---|
| Hidden Dim | 128 |
| Num Layers | 3 |
| Dropout | 0.5 |
| Batch Size | 4 |
| Learning Rate | 0.001 |
| Weight Decay | 0.0001 |
| Max Epochs | 200 |
| ES Patience | 20 |
| ES Min Epochs | 100 |
| LR Patience | 10 |
| LR Factor | 0.5 |
| Mixed Precision | Yes |
| Random Seed | 42 |
| Epochs Trained | 100 |
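One plausible reading of the early-stopping settings (ES Patience 20, ES Min Epochs 100) is sketched below. This is an assumption about how the two parameters interact, not the experiment's actual training loop, and `should_stop` is a hypothetical helper: training may stop only once the minimum epoch count is reached and the validation loss has not improved for `patience` consecutive epochs.

```python
from typing import Sequence


def should_stop(
    val_losses: Sequence[float],
    patience: int = 20,
    min_epochs: int = 100,
) -> bool:
    """Assumed rule: stop after min_epochs if the best loss is stale."""
    epochs_done = len(val_losses)
    if epochs_done < min_epochs:
        return False  # never stop before the minimum epoch count
    # 0-based index of the first epoch that reached the lowest loss.
    best_epoch = min(range(epochs_done), key=val_losses.__getitem__)
    epochs_without_improvement = epochs_done - 1 - best_epoch
    return epochs_without_improvement >= patience


# Synthetic curve: improves for 80 epochs, then plateaus at a worse value.
plateau = [1.0 - 0.01 * i for i in range(80)] + [0.5] * 20
# Synthetic curve: still improving at epoch 100.
improving = [1.0 - 0.001 * i for i in range(100)]

stop_on_plateau = should_stop(plateau)
stop_before_min = should_stop(plateau[:99])
stop_while_improving = should_stop(improving)
```

Under this assumed rule, a run that ends at exactly 100 epochs would mean the best validation loss was reached at epoch 80 or earlier.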