GIN · internal_only
Cross-Validation · Mar 4, 2026 · 5b6a5bdf21f74f2cbe4a7f2e702bc624
Description
4-fold cross-validation of GIN on internal_only to verify that the strong baseline result from Experiment 02 is not due to a lucky train/test split. Each fold uses a different 25% of the data as the test set.
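The split scheme can be sketched as follows. This is a minimal illustration, not the experiment's actual code: the helper `k_fold_indices` is hypothetical, the sample count of 715 is taken from the summed confusion matrix, the seed is taken from the configuration table, and the report does not say whether folds were stratified by class, so plain shuffled folds are assumed.

```python
import random


def k_fold_indices(n_samples, k=4, seed=42):
    """Return k (train, test) index pairs; each sample is tested exactly once.

    Hypothetical helper: plain shuffled folds, no class stratification.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        splits.append((train, test))
    return splits


# 715 = 474 + 28 + 1 + 212, the total of the summed confusion matrix.
splits = k_fold_indices(715, k=4, seed=42)
for fold, (train, test) in enumerate(splits):
    print(f"fold {fold}: train={len(train)} test={len(test)}")
```

Because 715 is not divisible by 4, three folds hold 179 test samples and one holds 178, so "25%" is approximate.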
Conclusion
Strong performance persists across all folds: mean accuracy is 95.9% (min 92.1%, max 98.9%). This confirms that the baseline is stable under repeated in-distribution splits. However, the CV folds still mix ransomware families across train and test, so the result says nothing about generalisation to unseen families.
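The limitation noted above can be made concrete with a small sketch. Everything here is hypothetical: the sample list, the family names, and the helpers `random_split`, `family_holdout_split`, and `shared_families` are illustrative only and do not come from the experiment. The point is that a random split leaves families shared between train and test, while holding out whole families does not.

```python
import random

# Hypothetical data: 40 samples spread over 5 made-up families.
samples = [(i, f"family_{i % 5}") for i in range(40)]


def random_split(samples, test_fraction=0.25, seed=42):
    """Shuffle and cut: families end up on both sides of the split."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def family_holdout_split(samples, held_out):
    """Put every sample of the held-out families in the test set."""
    train = [s for s in samples if s[1] not in held_out]
    test = [s for s in samples if s[1] in held_out]
    return train, test


def shared_families(train, test):
    """Families that appear in both the train and the test set."""
    return {fam for _, fam in train} & {fam for _, fam in test}


rand_train, rand_test = random_split(samples)
print("random split shares:", sorted(shared_families(rand_train, rand_test)))

hold_train, hold_test = family_holdout_split(samples, {"family_4"})
print("family holdout shares:", sorted(shared_families(hold_train, hold_test)))
```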
Mean Test Metrics (across folds)
| Metric | Value |
|---|---|
| Accuracy | 95.9% |
| F1 Macro | 95.4% |
| F1 Malware | 93.7% |
| Precision | 88.8% |
| Recall | 99.5% |
| AUROC | 98.8% |
| Best Val Loss | 0.1265 |
| Training Time | 3562.1 s |
Summed Confusion Matrix (all folds)
| | Pred Benign | Pred Malware |
|---|---|---|
| Actual Benign | 474 | 28 |
| Actual Malware | 1 | 212 |
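As a cross-check, the standard metrics can be recomputed from the summed matrix, treating malware as the positive class. This is a sketch with hypothetical variable names. These are pooled values over all folds, so they need not match the per-fold means reported above exactly.

```python
# Counts from the summed confusion matrix (malware = positive class).
tn, fp = 474, 28   # actual benign: predicted benign / predicted malware
fn, tp = 1, 212    # actual malware: predicted benign / predicted malware

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1_malware = 2 * tp / (2 * tp + fp + fn)
# F1 for the benign class swaps the roles of the two classes.
f1_benign = 2 * tn / (2 * tn + fn + fp)
f1_macro = (f1_malware + f1_benign) / 2

for name, value in [
    ("accuracy", accuracy),
    ("precision", precision),
    ("recall", recall),
    ("f1_malware", f1_malware),
    ("f1_macro", f1_macro),
]:
    print(f"{name}: {value:.1%}")
```

The pooled figures come out at 95.9% accuracy, 88.3% precision, 99.5% recall, 93.6% F1 for malware, and 95.3% macro F1. The small gaps to the reported means (for example 88.3% against 88.8% precision) are expected, because a mean of per-fold ratios is not the same as a ratio of summed counts. The 28 false positives against a single false negative also explain why precision is the weakest metric.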
Configuration
| Parameter | Value |
|---|---|
| Hidden Dim | 128 |
| Num Layers | 3 |
| Dropout | 0.5 |
| Batch Size | 4 |
| Learning Rate | 0.001 |
| Weight Decay | 0.0001 |
| Max Epochs | 200 |
| ES Patience | 20 |
| ES Min Epochs | 100 |
| LR Patience | 10 |
| LR Factor | 0.5 |
| Mixed Precision | Yes |
| Random Seed | 42 |
| Epochs Trained | 109 |
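One plausible reading of how ES Patience, ES Min Epochs, and Max Epochs interact is sketched below. The report does not describe the stopping rule, so this is an assumption: the patience counter runs from the first epoch, but stopping is only allowed once the minimum epoch count is reached. The class `EarlyStopper`, the function `run`, and the synthetic loss curves are all hypothetical.

```python
class EarlyStopper:
    """Assumed rule: stop once `patience` epochs pass without a new best
    validation loss, but never before `min_epochs`."""

    def __init__(self, patience=20, min_epochs=100):
        self.patience = patience
        self.min_epochs = min_epochs
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch (1-based); return True if training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return epoch >= self.min_epochs and self.bad_epochs >= self.patience


def run(val_losses, stopper, max_epochs=200):
    """Return the epoch at which training ends."""
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(epoch, loss):
            return epoch
    return min(len(val_losses), max_epochs)


# Synthetic curve: improves until epoch 89, then plateaus at a worse value.
late_best = [1.0 - 0.005 * e if e <= 89 else 0.6 for e in range(1, 201)]
# Synthetic curve: improves only until epoch 10.
early_best = [1.0 - 0.01 * e if e <= 10 else 0.95 for e in range(1, 201)]

stop_late = run(late_best, EarlyStopper())
stop_early = run(early_best, EarlyStopper())
print(stop_late, stop_early)
```

Under this assumed rule, a run whose best validation loss falls at epoch 89 ends at epoch 109, and a run that stops improving early still trains until epoch 100. The report does not state whether this is how the 109 epochs trained here came about.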