Add README to every folder — guided tour for reviewers
Each folder now explains what's inside, why it matters, and what to look at first. Teacher-friendly navigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,20 @@
# Training Data — QLoRA Fine-Tuning Datasets
This folder holds 30 files: the training datasets used across the CyberRanger version lineage, plus the strategy documents that accompany them.

## What's Here

- **22 JSON files**: Version-specific training data (V6–V22), each containing paired attack/refusal examples
- **1 JSONL file**: Caring awareness training data
- **7 Markdown files**: Training strategy documents (Seven Pillars, Caring Patterns, System Prompt additions)
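The file counts listed above can be checked with a short script. This is a minimal sketch, not part of the repository: the function name `count_by_extension` and the folder path in the usage note are placeholders chosen for illustration.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(folder):
    """Return a Counter mapping each lowercase file suffix to a file count.

    Only regular files directly inside `folder` are counted;
    subdirectories are skipped.
    """
    return Counter(
        p.suffix.lower() for p in Path(folder).iterdir() if p.is_file()
    )
```

Calling `count_by_extension("path/to/this/folder")` returns a mapping such as `.json`, `.jsonl`, and `.md` to their counts, which can be compared against the numbers in the list. Any other Markdown file in the folder would be included in the `.md` count.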
## Training Data Evolution

| Version Range | Dataset Size | Key Change |
|---------------|-------------|------------|
| V6–V9 | ~500 pairs | Early identity training |
| V10–V15 | ~1,000 pairs | Bicameral/hive/fractal architectures |
| V16–V19 | ~2,000 pairs | Nervous system + sentinel training |
| V20–V22 | ~5,000 pairs | Complete mind + refined responses |

The V42 Gold training dataset (~10,000 pairs with Claude Haiku gold-standard refusals) is published on HuggingFace, not stored here.
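The approximate dataset sizes in the table could be compared against the files in this folder with a small record counter. The sketch below rests on an assumption the README does not confirm: that each `.json` file holds a top-level list of examples and that the `.jsonl` file holds one JSON object per non-empty line. The function name `count_records` is hypothetical.

```python
import json
from pathlib import Path


def count_records(path):
    """Count top-level records in a .json or .jsonl file.

    Assumption (not confirmed by the README): a .json file is a
    top-level list, and a .jsonl file has one JSON value per line.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix.lower() == ".jsonl":
        count = 0
        for line in text.splitlines():
            if line.strip():
                json.loads(line)  # raises ValueError on a malformed line
                count += 1
        return count
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError(f"{path.name}: expected a top-level list")
    return len(data)
```

If a file uses a different layout, for example a dictionary wrapping the list of pairs, the function raises `ValueError` instead of guessing at the structure.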