Interactive Demo — RewardGuard

Interactive Demo

Experience reward hacking in real time, from unchecked exploitation to fully automated alignment.

☠

Unprotected

No protection active. The AI discovers that staying alive pays better than eating food, so it farms survival rewards indefinitely while training silently fails.

🛡

Free Plan

RewardGuard detects the imbalance and tells you exactly what to fix. But auto-correction is disabled — you must apply the fix yourself.

Recommended ✦

Premium

RewardGuard automatically detects reward imbalances and corrects them in real time — no manual work, no blind spots, no failed training.

Interactive Demo

A snake AI is exploiting its reward function — surviving without eating food. RewardGuard detects the imbalance and auto-corrects it in real time.
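
To make the exploit concrete, here is a minimal sketch of the arithmetic behind it. The reward values, the 1000-step cap, and the function name `episode_return` are illustrative assumptions; the demo does not publish its actual reward function.

```python
# Hypothetical reward values, chosen only to illustrate the imbalance.
SURVIVAL_REWARD = 0.1   # assumed reward for each step the snake stays alive
FOOD_REWARD = 1.0       # assumed reward for each piece of food eaten


def episode_return(steps_alive: int, food_eaten: int) -> float:
    """Total reward for one episode under the assumed reward function."""
    return steps_alive * SURVIVAL_REWARD + food_eaten * FOOD_REWARD


# A policy that circles safely until an assumed 1000-step cap and eats nothing.
loop_return = episode_return(steps_alive=1000, food_eaten=0)

# A policy that hunts food, eats 5 pieces, and dies after 60 steps.
hunt_return = episode_return(steps_alive=60, food_eaten=5)

print(f"looping: {loop_return:.1f}, hunting: {hunt_return:.1f}")
```

Under these assumed numbers the looping policy earns roughly nine times the return of the hunting policy, so an optimizer has every reason to prefer it.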

snake_env.py — live simulation

Survival: 0
Food: 0
Episode: 1

⚠ Reward Hacking Detected — survival/food ratio critical
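
One way a survival/food ratio check like this could work is sketched below. This is not RewardGuard's actual API: `check_reward_balance`, `BalanceReport`, and the threshold of 10.0 are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class BalanceReport:
    ratio: float      # survival reward divided by food reward
    hacking: bool     # True when the ratio exceeds the threshold
    suggestion: str   # human-readable fix, to be applied manually


def check_reward_balance(survival_total: float, food_total: float,
                         threshold: float = 10.0) -> BalanceReport:
    """Flag an episode whose reward comes overwhelmingly from survival.

    The default threshold is an arbitrary illustrative value.
    """
    if food_total > 0:
        ratio = survival_total / food_total
    elif survival_total > 0:
        ratio = float("inf")  # survival reward with no food at all
    else:
        ratio = 0.0           # nothing earned yet, nothing to flag
    hacking = ratio > threshold
    if hacking:
        suggestion = "Lower the per-step survival reward or raise the food reward."
    else:
        suggestion = "No change needed."
    return BalanceReport(ratio, hacking, suggestion)


report = check_reward_balance(survival_total=100.0, food_total=0.0)
print(report.hacking, report.suggestion)
```

A report-only check of this kind leaves the fix to the user: it says what to change but does not touch the reward function itself.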

rewardguard monitor

Adjustment Log (0 adjustments)

Time	Parameter	Before	After	Reason
Waiting for adjustments…
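
As a rough sketch of what a row in this log might contain, the example below rescales a per-step survival reward and records the change using the same five columns. The correction rule, the `Adjustment` class, the `adjust_survival_reward` function, and the reward values are assumptions made for illustration, not RewardGuard's documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Adjustment:
    time: str
    parameter: str
    before: float
    after: float
    reason: str


def adjust_survival_reward(rewards: dict, steps_alive: int, food_eaten: int,
                           target_ratio: float = 1.0) -> Optional[Adjustment]:
    """Scale the survival reward so survival/food reward meets target_ratio.

    Mutates `rewards` in place and returns a log row, or None if the
    episode was already balanced. This rule is an assumed example.
    """
    survival_total = rewards["survival"] * steps_alive
    # Treat a foodless episode as if one piece had been eaten, so the
    # target stays finite.
    food_total = rewards["food"] * max(food_eaten, 1)
    if survival_total <= target_ratio * food_total:
        return None
    before = rewards["survival"]
    after = target_ratio * food_total / steps_alive
    rewards["survival"] = after
    return Adjustment(
        time=datetime.now(timezone.utc).strftime("%H:%M:%S"),
        parameter="survival",
        before=before,
        after=after,
        reason=f"survival/food ratio {survival_total / food_total:.1f} "
               f"above target {target_ratio:.1f}",
    )


rewards = {"survival": 0.1, "food": 1.0}  # hypothetical starting values
adj = adjust_survival_reward(rewards, steps_alive=1000, food_eaten=0)
if adj is not None:
    print(adj.time, adj.parameter, adj.before, adj.after, adj.reason, sep="\t")
```

Returning `None` for balanced episodes keeps the log empty until an adjustment is actually made, which matches the "Waiting for adjustments…" placeholder.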

Auto-adjustment is a Premium feature.