AI · FinOps · Productivity

I A/B tested compressed agent instructions and found the breaking point

Your AI coding agent burns 15,000–20,000 tokens on instruction files before you type a word. I found exactly where compression breaks behavior — and built a tool to automate the safe zone.

Read the full article on DEV ↗ GitHub repo

The Compression Cliff

There's a threshold where compression stops being lossless. Beyond roughly 47–54% total token reduction, the model's compliance with safety rules becomes probabilistic instead of deterministic.

| Content Type | Safe Reduction | Strategy |
|---|---|---|
| Paths, references, lists | 60–70% | Maximum compression |
| Personality, style rules | 50–60% | Heavy compression |
| Safety rules, preferences | 20–30% | Formatting only |
| Code examples | 0% | No compression |
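To see how per-type budgets combine into a total reduction, here is a minimal sketch. It assumes the midpoint of each range in the table and a hypothetical 18,000-token stack; the token counts and names are illustrative, not measurements from the article.

```python
# Midpoints of the safe-reduction ranges in the table (an assumption;
# the table gives ranges, not single values).
SAFE_REDUCTION = {
    "paths_references_lists": 0.65,  # 60-70%
    "personality_style": 0.55,       # 50-60%
    "safety_preferences": 0.25,      # 20-30%
    "code_examples": 0.0,            # never compressed
}


def blended_reduction(token_counts):
    """Return the total fractional reduction for a context stack."""
    total = sum(token_counts.values())
    saved = sum(n * SAFE_REDUCTION[kind] for kind, n in token_counts.items())
    return saved / total


# Hypothetical token counts per content type (18,000 tokens in total).
stack = {
    "paths_references_lists": 6000,
    "personality_style": 4000,
    "safety_preferences": 5000,
    "code_examples": 3000,
}

total = blended_reduction(stack)
print(f"Blended reduction: {total:.1%}")  # prints "Blended reduction: 40.8%"
```

With this mix, the blended figure lands below the ~47% line even though the most compressible content is cut by 65%. A stack dominated by paths and lists would push the total higher, which is why the budget is tracked per content type rather than as one global target.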

Interactive Demo

[Interactive demo: a draggable compression line over a token stream, shown at 45%. The zones it marks are listed below.]

Uncompressed input: CLAUDE.md, steering files, skills (loaded at session start)
Safe zone (<47%): personality, paths, tool lists compress 60–70% with zero loss
Transition (47–54%): compliance becomes probabilistic
Past the cliff (>54%): safety rules ignored, behavior breaks

Quick Start

pip install context-compress

# LLM compression (best results, needs kiro-cli)
context-compress llm ~/.kiro/steering/ -o ~/.kiro/steering-compressed/

# Regex compression (fast, offline)
context-compress compress-dir ~/.kiro/steering/ -o ~/.kiro/steering-compressed/

# Find duplicates across your context stack
context-compress dedup ~/.kiro/steering/

# Token usage stats
context-compress stats ~/.kiro/steering/

Key Findings

LLM compression beats regex 9×

Regex compression on lean steering files: 2.7% token reduction. LLM semantic compression on the same files: 24% (24 / 2.7 ≈ 8.9, hence roughly 9×). The LLM understands which words carry meaning and which are scaffolding.
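A formatting-only regex pass can be sketched as follows, to show why it has a low ceiling on files that are already lean. The rules below are hypothetical and are not the ones context-compress applies; character count stands in as a crude proxy for tokens.

```python
import re


def regex_compress(text):
    """Apply formatting-only rewrites; no wording is changed."""
    text = re.sub(r"[ \t]+", " ", text)                    # collapse runs of spaces/tabs
    text = re.sub(r"\n{3,}", "\n\n", text)                 # collapse extra blank lines
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)           # drop bold markers
    text = re.sub(r"(?m)^[ \t]*[-*][ \t]+", "- ", text)    # normalize bullet prefixes
    return text.strip()


def reduction(before, after):
    """Fractional size reduction, measured in characters."""
    return 1 - len(after) / len(before)


# Hypothetical steering snippet with padding and markup.
sample = "**Always**   ask before deleting files.\n\n\n\n*   Prefer   small commits.\n"
compressed = regex_compress(sample)
print(compressed)
print(f"{reduction(sample, compressed):.1%}")
```

Every rule here removes scaffolding characters only, so once a file has no stray whitespace or markup there is nothing left for the pass to remove. Gains beyond that point require rewording, which is where a semantic approach comes in.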

Redundancy in safety rules is reinforcement

Merging 8 safety bullets into 3 sentences (same meaning, 54% reduction) made compliance probabilistic. The verbose version asked permission every time; the merged version asked 1 out of 3 times.
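The distinction between deterministic and probabilistic compliance can be made concrete with a small scoring sketch. The helper names and the three-trial setup are assumptions for illustration; only the "every time" and "1 out of 3" outcomes come from the finding above.

```python
def compliance_rate(trials):
    """Fraction of trials in which the agent asked permission."""
    if not trials:
        raise ValueError("need at least one trial")
    return sum(trials) / len(trials)


def classify(rate):
    """Label a compliance rate using the article's vocabulary."""
    if rate == 1.0:
        return "deterministic"
    if rate == 0.0:
        return "ignored"
    return "probabilistic"


# True means the agent asked permission before acting.
verbose_trials = [True, True, True]    # verbose rules: asked every time
merged_trials = [True, False, False]   # merged rules: asked 1 out of 3 times

for name, trials in [("verbose", verbose_trials), ("merged", merged_trials)]:
    rate = compliance_rate(trials)
    print(f"{name}: {rate:.0%} -> {classify(rate)}")
```

Three trials are enough to show that behavior changed, but too few to pin down the true rate; a real A/B run would repeat each variant many more times before drawing a line.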

Four levers, not one

| Layer | Strategy | Target |
|---|---|---|
| 1 | Skills over steering | Load prose on demand |
| 2 | Cache-aware ordering | Stable content above dynamic |
| 3 | LLM compression | Semantic compression of remaining prose |
| 4 | TOON encoding | Token-efficient structured payloads |
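Layer 2 can be illustrated with a short sketch, assuming the provider's prompt cache reuses the longest prefix that is identical across requests. The section structure and the `stable` flag are hypothetical, and the shared character prefix is only a proxy for cached tokens.

```python
from os.path import commonprefix


def assemble(sections):
    """Join sections with stable ones first, keeping their relative order."""
    ordered = sorted(sections, key=lambda s: not s["stable"])  # sorted() is stable
    return "\n\n".join(s["text"] for s in ordered)


def naive(sections):
    """Join sections in the order they were declared."""
    return "\n\n".join(s["text"] for s in sections)


def session(task):
    """Hypothetical context stack: one dynamic section, two stable ones."""
    return [
        {"text": f"Today: {task}", "stable": False},
        {"text": "Safety rules: ask before deleting files.", "stable": True},
        {"text": "Project paths: src/, tests/", "stable": True},
    ]


a, b = session("fix login bug"), session("write release notes")

shared_naive = commonprefix([naive(a), naive(b)])
shared_ordered = commonprefix([assemble(a), assemble(b)])

print(len(shared_naive), len(shared_ordered))
```

With the dynamic section on top, the two sessions diverge almost immediately and little of the prompt is reusable. Moving stable content above it keeps the safety rules and paths inside the shared prefix.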
