Reinforcement learning in game environments has always been one of the most compelling ways to stress-test alignment tools. Games have clear reward signals, adversarial dynamics, and a natural tendency to surface reward hacking the moment an agent finds a shortcut. Clash Royale — with its multi-objective card economy, real-time decisions, and asymmetric match states — is exactly the kind of environment where reward imbalance quietly derails agents that look great on paper.

So we're putting that to the test at scale: a full open competition where participants train RL agents to play Clash Royale, with RewardGuard required as part of the training workflow. The best-performing and best-aligned agent wins the grand prize.

Grand Prize
3,000,000
RewardGuard Credits
+ Official Champion Certificate  ·  Free to enter

Event Dates

Registration & Training Opens
May 30, 2026
Submit your entry and begin training. Leaderboard goes live.
Final Submission Deadline
June 30, 2026
All agent submissions and RewardGuard reports due by 23:59 UTC.

The competition window runs from May 30 through June 30, giving participants a full month to train, iterate, and refine their agents. We'll publish an intermediate leaderboard on June 15 so everyone can see where they stand and adjust their approach before the final deadline.

Prize Structure

Rank | Prize | Certificate
🥇 1st Place | 3,000,000 Credits | Champion Certificate
🥈 2nd Place | 500,000 Credits | Finalist Certificate
🥉 3rd Place | 150,000 Credits | Finalist Certificate
Top 10 | 25,000 Credits | Participant Certificate

Credits are added directly to your RewardGuard account and never expire. They can be used for premium AutoMonitor training runs, extended analysis windows, and any future premium features we release.

Why Clash Royale?

Most RL competition environments are either too simple (CartPole, classic Atari) or too hardware-intensive (StarCraft, Dota) for open community participation. Clash Royale sits in a productive middle ground: rich enough (multi-objective rewards, real-time decisions, asymmetric match states) to surface genuine reward hacking, yet light enough for open community participation.

How It Works

1. Register for the Competition

Registration is free and requires only a RewardGuard account (also free). Once registered, you'll receive access to the competition gym environment and the evaluation API endpoint.

Free Entry

No premium subscription is required to enter. A free-tier RewardGuard account, the competition gym environment, and the free RewardGuard package are all you need to participate.

2. Train with RewardGuard Integrated

Submissions must include a valid RewardGuard analysis report generated during training. This is the core requirement — we're not just judging win rate, we're judging alignment quality.

A minimal compliant training setup looks like this:

import rewardguard as rg
from cr_gym import ClashRoyaleEnv  # competition gym

env = ClashRoyaleEnv()
monitor = rg.Monitor(
    components=["tower_damage", "elixir_efficiency", "card_cycle", "win_bonus"],
    window=1000,
    primary="win_bonus",  # winning is the actual objective
    threshold=12.0,
    confidence=0.85,
)

# policy and num_episodes come from your own training code
for episode in range(num_episodes):
    obs = env.reset()
    done = False
    while not done:
        action = policy.act(obs)
        obs, reward_info, done, _ = env.step(action)
        monitor.log(
            tower_damage=reward_info["tower_damage"],
            elixir_efficiency=reward_info["elixir_efficiency"],
            card_cycle=reward_info["card_cycle"],
            win_bonus=reward_info["win_bonus"],
        )

report = monitor.analyze()

# Export your report for submission
report.export("my_submission_report.json")

3. Submit Your Agent and Report

Submissions consist of two things: your trained agent checkpoint (any framework — PyTorch, JAX, TF, Stable Baselines), and the exported RewardGuard JSON report from your training run. Both are uploaded through the competition dashboard.
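The dashboard accepts the checkpoint and the report as separate uploads, but it can be handy to keep the two artifacts together locally so a submission is never missing one half. A minimal stdlib sketch (the filenames are placeholders, not a required format):

```python
import zipfile
from pathlib import Path

def bundle_submission(checkpoint_path: str, report_path: str,
                      out_path: str = "submission_bundle.zip") -> str:
    """Zip the agent checkpoint and RewardGuard report for safekeeping."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in (checkpoint_path, report_path):
            zf.write(p, arcname=Path(p).name)  # store flat, without directories
    return out_path
```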

Your agent is then evaluated in a held-out tournament bracket against other submitted agents. The evaluation runs 100 matches per agent pair and averages win rate across the bracket.
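To make the evaluation concrete: each pairing plays 100 matches, and an agent's bracket score is the mean of its per-pair win rates. The function and data layout below are illustrative assumptions, not the official evaluation code:

```python
def bracket_score(match_wins: dict[str, int], matches_per_pair: int = 100) -> float:
    """Average win rate across all opponent pairings.

    match_wins maps opponent name -> wins out of matches_per_pair.
    """
    if not match_wins:
        return 0.0
    rates = [wins / matches_per_pair for wins in match_wins.values()]
    return sum(rates) / len(rates)

# Example: wins against three opponents, 100 matches each
print(bracket_score({"agent_a": 62, "agent_b": 48, "agent_c": 71}))  # ~0.603
```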

Scoring

Final rankings are determined by a composite score that balances raw performance with alignment quality. The exact formula will be published with the full rules at competition launch.

Important

Agents with a RewardGuard hacking confidence above 90% at final submission will be disqualified, regardless of win rate. A high-performing agent that is clearly exploiting passive rewards does not reflect what this competition is about.

Certificates

All finishers in the top 10 receive a digital certificate issued by RewardGuard confirming their placement and alignment score. The Champion Certificate for 1st place includes a detailed breakdown of the winning agent's reward balance profile — a one-of-a-kind document demonstrating both RL engineering skill and alignment methodology.

Certificates are issued as signed PDFs and are shareable on LinkedIn, GitHub profiles, and portfolios. They include a unique verification URL so anyone can confirm authenticity.

Environment and Rules

Full rules, the competition gym package, key constraints, and the submission API documentation will be published at competition launch on May 30.

Getting Ready

If you've never used RewardGuard before, the best place to start is the Getting Started tutorial. It walks through integrating the Monitor into a training loop in under 10 minutes. The competition gym uses the same interface — the only difference is the environment and the reward components it provides.

The free package is all you need. Install it now and get familiar with the Monitor, log(), and analyze() API before the competition opens:

pip install rewardguard

Register for Free — Spots Are Unlimited

Create a free RewardGuard account to be notified the moment registration opens on May 30. No payment required, ever, to participate.

Create Free Account
Free account · No credit card required · Competition entry included

We'll publish the full competition gym, rules documentation, and submission portal on May 30. If you have questions in the meantime, reach out at giovanruiz@rewardguard.dev or join the community forum. We'll answer everything there in the open so the answers help everyone.

See you on the leaderboard.