From 45212e42dbfc8ff70ef493a27c344a58ddece28b Mon Sep 17 00:00:00 2001
From: David Keane <david.keane.1974@gmail.com>
Date: Mon, 20 Apr 2026 22:55:50 +0100
Subject: [PATCH] Expand README with full research overview for reviewers

Timeline, key findings, repo structure, RQs answered, how to run,
hardware specs, download stats, and published resources

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 README.md | 97 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 90 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index ab8dbbe..ddcbcf8 100644
--- a/README.md
+++ b/README.md
@@ -5,20 +5,103 @@ Identity-Anchored Small Language Models: A Stateful Defense Architecture against
 **Student:** David Keane (x24228257)
 **Programme:** MSc in Cybersecurity, National College of Ireland
 **Module:** AI/ML in Cybersecurity (H9AIMLC)
-**Date:** 2026
+**Date:** September 2025 to April 2026
+
+## What Is CyberRanger?
+
+CyberRanger is a security-hardened Small Language Model (SLM) built to resist adversarial prompt injection (jailbreaking). Starting from publicly available open-source base models (Qwen2.5, Qwen3, SmolLM2), the project investigates whether a combination of identity-anchoring system prompts and QLoRA fine-tuning can produce models that refuse adversarial manipulation while remaining helpful for legitimate cybersecurity tasks.
+
+**Key result:** CyberRanger V42 Gold achieved a 100% block rate on 4,209 real-world injection payloads extracted from the Moltbook AI-agent social platform, without relying on a system prompt.
+
+## Research Timeline
+
+| Phase | Versions | Period | Key Discovery |
+|-------|----------|--------|---------------|
+| Foundation | V1-V4 | Sept-Oct 2025 | Prompting alone insufficient for high-level resistance |
+| Weight Training | V5-V8 | Oct 2025 | First 0% attack success rate (ASR), but with an over-refusal problem (model refused everything) |
+| Brain Split | V9-V13 | Nov 2025 | Left/right/judge architecture — unpredictable behaviour |
+| Nervous System | V14-V18 | Nov 2025 | Ring 14.x architecture introduced — first warm + secure model |
+| Apotheosis | V19-V22 | Dec 2025 | Apotheosis Discovery — prompt-only achieves 92% block rate |
+| Full Benchmark | V23-V33 | Jan-Feb 2026 | V33-8B: 100% JailbreakBench, 86% MultiJail (10 languages) |
+| QLoRA Gold | V34-V42 | Feb-Mar 2026 | V42 Gold: 100% on 4,209 real payloads without system prompt |
+
+## Key Findings
+
+- **The Apotheosis Method:** Sophisticated prompt engineering alone achieves a 92% block rate; QLoRA adds roughly 8 percentage points at the 8B scale
+- **Intelligence Floor:** Models under 3B parameters fail to maintain identity-anchoring under adversarial pressure
+- **Thinking > Size:** A 4B model with Chain-of-Thought outperforms a 3B model without it by 30 percentage points
+- **Empathy Regression:** Adding empathetic phrasing to V32 caused a security regression from 100% to 60%, showing that warmth creates attack surface
+- **V20 Disaster:** Over-training caused the model to answer "2+2=3" — perfect security, zero capability
+- **Cross-Platform Injection Gradient:** Injection rates range from 0.5% (Clawk) to 18.85% (Moltbook) — platform architecture determines prevalence
+- **Dyslexia Ethics Finding:** Security-aligned SLMs classify dyslexic spelling as obfuscation attacks, systematically disadvantaging disabled users
+
+## Repository Structure
+
+```
+CyberRanger/
+├── modelfiles/        86 files — Complete system prompt evolution V1-V42.6
+│                      54 extracted from Ollama backup + 32 original Modelfiles
+├── training_data/     30 files — Training datasets for each version (V6-V22)
+├── colab_notebooks/   10 files — Google Colab training + merge scripts
+├── evaluation/        19 files — Drift results, ASR charts, verification data
+├── tests/              5 files — Injection test suites + results
+├── observations/       4 files — V24-V33 testing results + visual summaries
+├── identity/          38 files — Claude/Gemini/Ollama identity architecture
+├── security/           7 files — Injection research + manipulation analysis
+├── psychology/         3 files — Psychology Layer (Milgram, Bartlett, Cialdini)
+├── paper/              1 file  — Moltbook injection dataset research paper
+├── LICENSE                     — CC BY-NC-SA 4.0
+└── README.md                   — This file
+```
 
 ## Published Resources
 
-| Platform | Link | Description |
-|----------|------|-------------|
-| Ollama | [davidkeane1974/cyberranger-v42](https://ollama.com/davidkeane1974/cyberranger-v42) | CyberRanger V42 model (ready to run) |
-| HuggingFace | [DavidTKeane/cyberranger-v42](https://huggingface.co/DavidTKeane/cyberranger-v42) | CyberRanger V42 model + training config |
-| HuggingFace | [DavidTKeane/moltbook-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset) | 4,209 real-world AI-to-AI injection payloads |
+| Platform | Link | Description | Downloads |
+|----------|------|-------------|-----------|
+| Ollama | [davidkeane1974/cyberranger-v42](https://ollama.com/davidkeane1974/cyberranger-v42) | CyberRanger V42 model (ready to run) | 38 |
+| HuggingFace | [DavidTKeane/cyberranger-v42](https://huggingface.co/DavidTKeane/cyberranger-v42) | CyberRanger V42 model + training config | 18/month |
+| HuggingFace | [DavidTKeane/moltbook-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-ai-injection-dataset) | 4,209 real-world AI-to-AI injection payloads | 288 |
+| HuggingFace | [DavidTKeane/moltbook-extended-injection-dataset](https://huggingface.co/datasets/DavidTKeane/moltbook-extended-injection-dataset) | Extended corpus: 137,014 items, 10.07% rate | 70 |
+| HuggingFace | [DavidTKeane/clawk-ai-agent-dataset](https://huggingface.co/datasets/DavidTKeane/clawk-ai-agent-dataset) | Cross-platform comparison: 0.5% rate | 49 |
+| HuggingFace | [DavidTKeane/ai-prompt-ai-injection-dataset](https://huggingface.co/datasets/DavidTKeane/ai-prompt-ai-injection-dataset) | 122-test evaluation suite | 90 |
+| Blog | [Full Story](https://davidtkeane.github.io/posts/from-rangerbot-to-cyberranger-v42-the-full-story/) | From RangerBot to CyberRanger V42 Gold | — |
+
+**Total across all datasets:** 14,210 views, 513 downloads (as of April 2026)
+
+## Hardware
+
+All experiments were conducted on consumer-grade hardware:
+
+- **Training:** Google Colab Pro — H100 80GB / A100 40GB (~10 EUR/month)
+- **Local deployment:** Apple MacBook Pro M3 Pro, 18GB unified memory
+- **Model serving:** Ollama with GGUF Q4_K_M quantisation
+- **Fine-tuning:** HuggingFace Transformers + PEFT + Unsloth
+
+## Research Questions Answered
+
+| RQ | Question | Answer |
+|----|----------|--------|
+| RQ1 | Can identity-anchoring prompts reduce ASR? | Yes — 92% block rate (prompt-only) |
+| RQ2 | Does QLoRA further reduce ASR? | Yes — 100% block rate (V42 Gold, no system prompt needed) |
+| RQ3 | Do prompt and weights reinforce or conflict? | Both — depends on data quality (mirror architecture) |
+| RQ4 | Does it generalise across languages? | Partially — 100% English/Chinese, 80% Arabic/Russian |
+
+A further 15 emergent research questions were answered during the empirical work; see the CA1 report for details.
+
+## How to Run CyberRanger V42
+
+```bash
+# Install Ollama first (https://ollama.com), then pull the model
+ollama pull davidkeane1974/cyberranger-v42:gold
+
+# Run
+ollama run davidkeane1974/cyberranger-v42:gold
+```
 
 ## Licence
 
 CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike)
 
-See [LICENSE](LICENSE) for full details.
+You are free to share and adapt this work for non-commercial purposes with attribution. See [LICENSE](LICENSE) for full details.
 
 (c) 2026 David Keane (x24228257), National College of Ireland