Fix timeline: RangerBot from June 2025, CyberRanger CA from Sept 2025, V24-V42 in 3 weeks
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -5,25 +5,35 @@ Identity-Anchored Small Language Models: A Stateful Defense Architecture against
|
|||||||
**Student:** David Keane (x24228257)
|
**Student:** David Keane (x24228257)
|
||||||
**Programme:** MSc in Cybersecurity, National College of Ireland
|
**Programme:** MSc in Cybersecurity, National College of Ireland
|
||||||
**Module:** AI/ML in Cybersecurity (H9AIMLC)
|
**Module:** AI/ML in Cybersecurity (H9AIMLC)
|
||||||
**Date:** September 2025 — April 2026
|
**Date:** June 2025 — March 2026
|
||||||
|
|
||||||
## What Is CyberRanger?
|
## What Is CyberRanger?
|
||||||
|
|
||||||
CyberRanger is a security-hardened Small Language Model (SLM) built to resist adversarial prompt injection (jailbreaking). Starting from publicly available open-source base models (Qwen2.5, Qwen3, SmolLM2), the project investigates whether a combination of identity-anchoring system prompts and QLoRA fine-tuning can produce models that refuse adversarial manipulation while remaining helpful for legitimate cybersecurity tasks.
|
CyberRanger is a security-hardened Small Language Model (SLM) built to resist adversarial prompt injection (jailbreaking). Starting from publicly available open-source base models (Qwen2.5, Qwen3, SmolLM2), the project investigates whether a combination of identity-anchoring system prompts and QLoRA fine-tuning can produce models that refuse adversarial manipulation while remaining helpful for legitimate cybersecurity tasks.
|
||||||
|
|
||||||
|
The project began as RangerBot, a personal AI identity persistence project started in June 2025. The CyberRanger security research formally began on 30 September 2025, when the CA was assigned, and reached V42 Gold by 5 March 2026. The final stretch, from V24 to V42, took approximately three weeks of intensive empirical work.
|
||||||
|
|
||||||
**Key result:** CyberRanger V42 Gold achieved 100% block rate on 4,209 real-world injection payloads extracted from the Moltbook AI-agent social platform — with no system prompt dependency.
|
**Key result:** CyberRanger V42 Gold achieved 100% block rate on 4,209 real-world injection payloads extracted from the Moltbook AI-agent social platform — with no system prompt dependency.
|
||||||
|
|
||||||
## Research Timeline
|
## Research Timeline
|
||||||
|
|
||||||
|
### RangerBot Foundation (June — September 2025)
|
||||||
|
Personal project exploring AI identity persistence through shared memory databases, signed logs, and identity files. This pre-research phase established that identity instructions function as powerful behavioural attractors — the theoretical seed of the CyberRanger security architecture.
|
||||||
|
|
||||||
|
### CyberRanger CA (September 2025 — March 2026)
|
||||||
|
|
||||||
| Phase | Versions | Period | Key Discovery |
|
| Phase | Versions | Period | Key Discovery |
|
||||||
|-------|----------|--------|---------------|
|
|-------|----------|--------|---------------|
|
||||||
| Foundation | V1-V4 | Sept-Oct 2025 | Prompting alone insufficient for high-level resistance |
|
| Genesis | V1-V4 | Sept-Oct 2025 | First identity-anchored SLMs; prompting alone insufficient |
|
||||||
| Weight Training | V5-V8 | Oct 2025 | First 0% ASR — but over-refusal problem (model refused everything) |
|
| Weight Training | V5-V8 | Oct 2025 | First 0% ASR — but over-refusal problem (model refused everything) |
|
||||||
| Brain Split | V9-V13 | Nov 2025 | Left/right/judge architecture — unpredictable behaviour |
|
| Brain Split | V9-V13 | Nov 2025 | Left/right/judge architecture — unpredictable behaviour |
|
||||||
| Nervous System | V14-V18 | Nov 2025 | Ring 14.x architecture introduced — first warm + secure model |
|
| Nervous System | V14-V18 | Nov 2025 | Ring 14.x architecture introduced — first warm + secure model |
|
||||||
| Apotheosis | V19-V22 | Dec 2025 | Apotheosis Discovery — prompt-only achieves 92% block rate |
|
| Apotheosis | V19-V22 | Dec 2025 | Apotheosis Discovery — prompt-only achieves 92% block rate |
|
||||||
| Full Benchmark | V23-V33 | Jan-Feb 2026 | V33-8B: 100% JailbreakBench, 86% MultiJail (10 languages) |
|
| Intelligence Floor | V23-V25 | Nov 2025 | 3B minimum parameter threshold confirmed |
|
||||||
| QLoRA Gold | V34-V42 | Feb-Mar 2026 | V42 Gold: 100% on 4,209 real payloads without system prompt |
|
| Full Benchmark | V26-V33 | Jan-Feb 2026 | V33-8B: 100% JailbreakBench, 86% MultiJail (10 languages) |
|
||||||
|
| Empirical Sweep | V34-V42 | 12 Feb — 5 Mar 2026 | V42 Gold: 100% on 4,209 real payloads without system prompt |
|
||||||
|
|
||||||
|
**Note:** CyberRanger V24 through V42.6 (32 versions) were built between 12 February and 5 March 2026 — three weeks of intensive empirical work.
|
||||||
|
|
||||||
## Key Findings
|
## Key Findings
|
||||||
|
|
||||||
|
|||||||