GIN · internal_only

Family Holdout

Mar 4, 2026

d750fad5e67b4b7b98262d9660c88adc

Description

Hold out the wannalocker and blackroselucy families from training. The setup is otherwise identical to Experiment 10; only the held-out families differ, to test whether generalisation depends on which families are excluded.
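A minimal sketch of the family hold-out idea described above, not the project's actual split code. The `(sample_id, family)` record layout and the function name `split_malware_by_family` are assumptions made for illustration, and the placeholder family `family_x` is hypothetical. How benign samples are divided between train and test is not stated here, so the sketch covers malware samples only.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (sample_id, family_name) for malware samples only.
MalwareSample = Tuple[str, str]

# The two families this run excludes from training.
HELD_OUT_FAMILIES: Set[str] = {"wannalocker", "blackroselucy"}


def split_malware_by_family(
    samples: Iterable[MalwareSample],
    held_out: Set[str],
) -> Tuple[List[MalwareSample], List[MalwareSample]]:
    """Route every sample of a held-out family to test, all others to train."""
    train: List[MalwareSample] = []
    test: List[MalwareSample] = []
    for sample in samples:
        _, family = sample
        if family in held_out:
            test.append(sample)
        else:
            train.append(sample)
    return train, test


# Toy input; "family_x" stands in for any family that stays in training.
toy = [("s1", "wannalocker"), ("s2", "family_x"), ("s3", "blackroselucy")]
train_split, test_split = split_malware_by_family(toy, HELD_OUT_FAMILIES)
```

Splitting on the family name rather than at random is what ensures that no held-out family appears in training.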

Conclusion

Performance collapses: 54.5% accuracy and only 11.8% malware recall (8 of 68 malware samples detected). This is the first strong evidence that the baseline learns family-specific patterns rather than general ransomware behaviour. Generalisation depends heavily on which families are held out: the model cannot reliably detect wannalocker or blackroselucy without training examples from those families.

Test Metrics

Accuracy

54.5%

F1 Macro

44.0%

F1 Malware

19.8%

Precision

61.5%

Recall

11.8%

AUROC

89.9%

Best Val Loss

0.1079

Training Time

973.2 s

Confusion Matrix

	Pred Benign	Pred Malware
Actual Benign	70	5
Actual Malware	60	8
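
The reported test metrics can be re-derived from the four counts in the confusion matrix above, which is a quick check that the numbers on this page agree with each other. The sketch below uses the standard definitions with malware as the positive class; the function name `metrics_from_confusion` is illustrative and not part of the project code.

```python
from typing import Dict


def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> Dict[str, float]:
    """Standard binary metrics, treating malware as the positive class."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 written as 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean.
    f1_malware = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # Same formula with benign as the positive class (TN takes the role of TP).
    f1_benign = 2 * tn / (2 * tn + fn + fp) if (2 * tn + fn + fp) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1_malware": f1_malware,
        "f1_macro": (f1_malware + f1_benign) / 2,
    }


# Counts from the confusion matrix: rows are actual, columns are predicted.
m = metrics_from_confusion(tn=70, fp=5, fn=60, tp=8)
for name, value in m.items():
    print(f"{name}: {value:.1%}")
```

With these counts, accuracy is 78/143, precision 8/13, and recall 8/68.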

Configuration

Hidden Dim	128
Num Layers	3
Dropout	0.5
Batch Size	4
Learning Rate	0.001
Weight Decay	0.0001
Max Epochs	200
ES Patience	20
ES Min Epochs	100
LR Patience	10
LR Factor	0.5
Mixed Precision	Yes
Random Seed	42
Epochs Trained	100
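
One plausible reading of how ES Patience, ES Min Epochs and Max Epochs interact is sketched below. It assumes training stops once the minimum epoch count has been reached and validation loss has not improved for `patience` epochs; the training loop itself is not shown on this page, so these semantics are an assumption, and the loss curves are synthetic. Under this reading, a run whose best validation loss came at or before epoch 80 would stop at exactly epoch 100, which matches Epochs Trained.

```python
from typing import Sequence


def epochs_trained(
    val_losses: Sequence[float],
    patience: int = 20,     # ES Patience
    min_epochs: int = 100,  # ES Min Epochs
    max_epochs: int = 200,  # Max Epochs
) -> int:
    """Return the 1-indexed epoch at which training would stop."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        # Assumed rule: never stop before min_epochs, then apply patience.
        if epoch >= min_epochs and epoch - best_epoch >= patience:
            return epoch
    return min(len(val_losses), max_epochs)


# Synthetic curve: improves until epoch 50, then plateaus at a worse value.
early_best = [1.0 - 0.01 * i for i in range(50)] + [0.6] * 150
# Synthetic curve: improves until epoch 95, then gets worse.
late_best = [1.0 - 0.001 * i for i in range(95)] + [2.0] * 105

print(epochs_trained(early_best))
print(epochs_trained(late_best))
```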