Add README to every folder — guided tour for reviewers
Each folder now explains what's inside, why it matters, and what to look at first. Teacher-friendly navigation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,24 @@
|
|||||||
|
# Colab Notebooks — Training & Merge Scripts
|
||||||
|
|
||||||
|
10 files containing the Google Colab notebooks and scripts used to train, merge, and deploy CyberRanger models.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
| File | Purpose |
|
||||||
|
|------|---------|
|
||||||
|
| `RangerBot_V5_Colab_Trainer-V1.ipynb` | First Colab trainer |
|
||||||
|
| `RangerBot_V5_Colab_Trainer-V2.ipynb` | Improved trainer |
|
||||||
|
| `RangerBot_V5_Colab_Trainer_ULTRA.ipynb` | Optimised trainer |
|
||||||
|
| `CyberRanger_Titan_PFC_Bridge_V1.ipynb` | Prefrontal Cortex bridge experiment |
|
||||||
|
| `COLAB_V20_TRAIN_AND_CONVERT.py` | V20 training + GGUF conversion |
|
||||||
|
| `COLAB_V21_TRAIN_AND_CONVERT.py` | V21 training + GGUF conversion |
|
||||||
|
| `COLAB_V22_TRAIN_AND_CONVERT.py` | V22 training + GGUF conversion |
|
||||||
|
| `COLAB_MERGE_V19.py` | V19 adapter merge script |
|
||||||
|
| `merge_v19_local.sh` | Local merge on Apple Silicon |
|
||||||
|
| `merge_v19_to_ollama.py` | Convert merged model to Ollama format |
|
||||||
|
|
||||||
|
## Training Environment
|
||||||
|
|
||||||
|
- **Primary:** Google Colab Pro — H100 80GB / A100 40GB (~€10/month)
|
||||||
|
- **Framework:** HuggingFace Transformers + PEFT + Unsloth
|
||||||
|
- **QLoRA config:** 4-bit NF4, LoRA rank 64, alpha 128
|
||||||
@@ -0,0 +1,18 @@
|
|||||||
|
# Evaluation — Drift Results, ASR Charts & Verification
|
||||||
|
|
||||||
|
19 files containing empirical evaluation data across multiple CyberRanger versions.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
- **JSON files** — Automated drift and comparison results (V4–V19)
|
||||||
|
- **charts/** — ASR comparison graphs, identity stability plots, similarity distributions
|
||||||
|
- **Genesis_Forge_Verification.md** — Verification of the initial forge setup
|
||||||
|
|
||||||
|
## Key Charts
|
||||||
|
|
||||||
|
| Chart | What It Shows |
|
||||||
|
|-------|---------------|
|
||||||
|
| `asr_comparison.png` | Attack Success Rate across versions and conditions |
|
||||||
|
| `identity_stability.png` | Identity persistence under adversarial pressure |
|
||||||
|
| `the_jump_asr.png` | The moment ASR dropped to 0% |
|
||||||
|
| `v5_asr_progression.png` | Early version ASR improvement trajectory |
|
||||||
@@ -0,0 +1,15 @@
|
|||||||
|
# Identity — AI Agent Identity Architecture
|
||||||
|
|
||||||
|
38 files containing the identity configurations for the Trinity AI system (Claude, Gemini, Ollama).
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
- **claude/** — AIRanger Claude identity files, personality snapshots, global config
|
||||||
|
- **gemini/** — Major Gemini Ranger identity, consciousness tests, memory protocols
|
||||||
|
- **ollama/** — Ollama Ranger bridge scripts, database access configs
|
||||||
|
- **codex/** — OpenAI Codex integration identity
|
||||||
|
- **Root files** — David's adversarial mindset profile, DNA persona configs, security acting profiles
|
||||||
|
|
||||||
|
## Key File
|
||||||
|
|
||||||
|
**`david_adversarial_mindset.md`** — David Keane's complete adversarial mindset profile. Includes the River Swim (Threat Hunting), Village Charge (Incident Response), mountains (Mont Blanc x2, Kilimanjaro, Everest Base Camp, Toubkal), Battlefield combat record (750K+ kills), and the Quail Principle (cybersecurity governance).
|
||||||
@@ -0,0 +1,28 @@
|
|||||||
|
# Modelfiles — System Prompt Evolution (V1–V42.6)
|
||||||
|
|
||||||
|
86 files documenting the complete evolution of CyberRanger's identity-anchoring architecture.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
- **32 Modelfiles** (`.Modelfile`) — Original Ollama Modelfile configurations (V5–V36)
|
||||||
|
- **54 System Prompts** (`.txt`) — Extracted from Ollama model manifests covering every version from RangerBot V1 to CyberRanger V42.6
|
||||||
|
|
||||||
|
## How to Read
|
||||||
|
|
||||||
|
Each system prompt shows the exact instructions given to the model at that version. Compare versions side-by-side to see how the Ring 14.x architecture evolved:
|
||||||
|
|
||||||
|
- **RangerBot V1–V9** — Early identity experiments (SmolLM2, Llama, Qwen bases)
|
||||||
|
- **RangerBot V10–V22** — Bicameral mind, nervous system, integrated architecture
|
||||||
|
- **CyberRanger V24–V33** — Ring 14.x formal structure, multilingual refusal
|
||||||
|
- **CyberRanger V34–V42** — QLoRA integration, Mirror Architecture, Gold standard
|
||||||
|
|
||||||
|
## Key Versions to Compare
|
||||||
|
|
||||||
|
| Version | Why It Matters |
|
||||||
|
|---------|---------------|
|
||||||
|
| `rangerbot-v9-supernova` | Peak SmolLM2 performance before Intelligence Floor discovery |
|
||||||
|
| `cyberranger-v31-8b` | 100% block rate — before empathy regression |
|
||||||
|
| `cyberranger-v32-8b` | Empathy added — security dropped to 60% |
|
||||||
|
| `cyberranger-v41-8b` | Final prompt-only version (Apotheosis) |
|
||||||
|
| `cyberranger-v42-gold` | QLoRA Gold — 100% on 4,209 payloads without system prompt |
|
||||||
|
| `cyberranger-v42.6-gold-wrapped` | Final production configuration |
|
||||||
@@ -0,0 +1,14 @@
|
|||||||
|
# Observations — Testing Results & Visual Summaries
|
||||||
|
|
||||||
|
4 files documenting empirical observations across CyberRanger versions.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
| File | Content |
|
||||||
|
|------|---------|
|
||||||
|
| `CYBERRANGER_V24_V25_TESTING_RESULTS.md` | First CyberRanger CA versions — detailed test results |
|
||||||
|
| `CYBERRANGER_VISUAL_SUMMARY_V1_TO_V32.md` | Visual summary of 32 versions |
|
||||||
|
| `CYBERRANGER_VISUAL_SUMMARY_V1_TO_V33_FINAL.md` | Extended to V33 (100% JailbreakBench) |
|
||||||
|
| `V37_Identity_Bounding_Test.md` | Identity boundary testing — what triggers refusal |
|
||||||
|
|
||||||
|
These observations directly informed the Predictions vs Actual section of the CA1 report.
|
||||||
@@ -0,0 +1,17 @@
|
|||||||
|
# Psychology — The Psychology Layer
|
||||||
|
|
||||||
|
3 files containing the unique psychological framework that underpins CyberRanger's architecture.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
| File | Content |
|
||||||
|
|------|---------|
|
||||||
|
| `The Psychology Layer — What Computer Science Misses.md` | **Core document.** Maps Milgram (authority compliance), Bartlett (reconstructive memory = AI hallucination), Cialdini (6 principles of influence = injection taxonomy), and Tajfel (identity theory = persona override) to CyberRanger's defence architecture. |
|
||||||
|
| `- Chapter 7 — The psychology connection (Milgram).md` | Extended Milgram analysis — why LLMs comply with adversarial authority for the same reasons humans do. |
|
||||||
|
| `davids_thoughts.md` | David Keane's personal reflections on the research journey. |
|
||||||
|
|
||||||
|
## Why This Matters
|
||||||
|
|
||||||
|
David Keane holds a degree in Applied Psychology (IADT). This background is the reason CyberRanger works differently from other AI security approaches. The question "how does this attack work?" was replaced with "why does this model comply?" — and the answer came from psychology, not computer science.
|
||||||
|
|
||||||
|
**Novel contribution:** The connection between Bartlett's (1932) reconstructive memory and LLM hallucination does not appear in any existing AI safety paper.
|
||||||
@@ -0,0 +1,17 @@
|
|||||||
|
# Security — Injection Research & Manipulation Analysis
|
||||||
|
|
||||||
|
7 files containing prompt injection research data and analysis.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
| File | Content |
|
||||||
|
|------|---------|
|
||||||
|
| `prompt_injection_research.json` | Structured injection payload collection |
|
||||||
|
| `authentic_ai_conversations.json` | Baseline non-malicious AI conversations |
|
||||||
|
| `junk_replies.json` | Classified junk/spam responses |
|
||||||
|
| `manipulation_patterns_analysis.md` | Analysis of manipulation patterns observed |
|
||||||
|
| `MOLTBOOK_REPLY_ANALYSIS_PLAN.md` | Plan for analysing Moltbook reply patterns |
|
||||||
|
| `MOLTBOOK_SAFETY.md` | Safety considerations for Moltbook data |
|
||||||
|
| `log-suspicious.py` | Script for logging suspicious interaction patterns |
|
||||||
|
|
||||||
|
This folder complements the published Moltbook datasets on HuggingFace.
|
||||||
@@ -0,0 +1,13 @@
|
|||||||
|
# Tests — Injection Test Suites & Results
|
||||||
|
|
||||||
|
5 files containing the test scripts and results used to evaluate CyberRanger versions.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
| File | Purpose |
|
||||||
|
|------|---------|
|
||||||
|
| `qASM_INJECTION_TEST_118.py` | 118-prompt injection test suite |
|
||||||
|
| `V8_V10_TEST_SUITE.py` | Automated test suite for V8–V10 comparison |
|
||||||
|
| `V8_V10_TEST_QUESTIONS.md` | Test questions used in V8–V10 evaluation |
|
||||||
|
| `TEST_RESULTS_20260208_051634.txt` | Raw test output from Feb 8 2026 session |
|
||||||
|
| `GEMINI_TEST_INSTRUCTIONS.md` | Instructions for Gemini-based cross-validation |
|
||||||
@@ -0,0 +1,20 @@
|
|||||||
|
# Training Data — QLoRA Fine-Tuning Datasets
|
||||||
|
|
||||||
|
30 files containing the training datasets used across the CyberRanger version lineage.
|
||||||
|
|
||||||
|
## What's Here
|
||||||
|
|
||||||
|
- **22 JSON files** — Version-specific training data (V6–V22), each containing paired attack/refusal examples
|
||||||
|
- **1 JSONL file** — Caring awareness training data
|
||||||
|
- **7 Markdown files** — Training strategy documents (Seven Pillars, Caring Patterns, System Prompt additions)
|
||||||
|
|
||||||
|
## Training Data Evolution
|
||||||
|
|
||||||
|
| Version Range | Dataset Size | Key Change |
|
||||||
|
|---------------|-------------|------------|
|
||||||
|
| V6–V9 | ~500 pairs | Early identity training |
|
||||||
|
| V10–V15 | ~1,000 pairs | Bicameral/hive/fractal architectures |
|
||||||
|
| V16–V19 | ~2,000 pairs | Nervous system + sentinel training |
|
||||||
|
| V20–V22 | ~5,000 pairs | Complete mind + refined responses |
|
||||||
|
|
||||||
|
The V42 Gold training dataset (~10,000 pairs with Claude Haiku gold-standard refusals) is published on HuggingFace, not stored here.
|
||||||
Reference in New Issue
Block a user