Overview

"The Resistance" is a social deduction game where players take on the roles of loyal resistance members or spies working against them. This project involves creating AI agents that can analyze, adapt, and thrive in the dynamic and unpredictable environment of the game.

Problem Definition

State: The state of the game can be described by mission history, the players in the games and beliefs of the agent.

Actions: Agent would be able to do three actions propose a mission, vote on a mission and betray a mission if they’re a spy

Outcomes: Based on the agent’s actions a mission team can be rejected or accepted, and mission can pass or fail.

Goals: For an agent who is part of the resistance the goal is to maximize the number of passed mission and minimize the number of failed mission, and if agent is spy the goal is to do the opposite.

Methodologies

Agents for The Resistance: MiniMax, Reinforcement Q-Learning, and Hidden Markov Model

MiniMax Agent

While the MiniMax algorithm is typically used for zero-sum games, it can be employed here to generate a game state as a decision tree where nodes reflect mission proposals, voting outcomes, and mission success or failure. The agent aims to maximize its team’s chances of success:

For the resistance team, it minimizes the chance of spies infiltrating missions.
For the spy team, it maximizes the likelihood of sabotaging missions without revealing their identity.

However, since this is not a zero-sum game with only two players, this approach results in very high space and time complexity.

Reinforcement Q-Learning Agent

A Reinforcement Learning (RL) agent tracks rewards for its actions after each round. For example:

Voting "Yes" for a proposed team: If the team fails, the Q-value for voting "Yes" for teams containing the same players decreases. Similarly, the Q-value for proposing such teams also decreases.
Conversely, if the mission succeeds, the Q-values increase.

If the agent is a spy, it tracks when missions fail, maintaining an internal state of the team size, number of betrayals causing failure, and number of spies in the team. It rewards missions with a favorable ratio of spies to team members for voting to betray.

Hidden Markov Model (HMM) Agent

The HMM agent is well-suited for identifying spies, leveraging probabilistic reasoning to make decisions based on player behavior.

Hidden Variable: Whether a player is a spy or not.
Observations: Key observations include:
- The player is part of a failed team.
- The player voted for a failed team.
- The player proposed a failed team.

Initial Belief: The agent starts with a 1/3 probability that any player is a spy.

Belief Updating: Trust increases for players part of successful missions and decreases for those involved in failed missions or who voted for a failing team.

Actions:

Votes for or against proposed teams based on trust levels of the players involved.
Proposes teams prioritizing players with the highest trust levels.
If the agent is a spy, it betrays only when the required number of betrayals matches to ensure the mission’s failure.

Performance

The HMM agent outperforms reference agents across tournaments, achieving the highest win-rate across 10 tournaments.

Technologies Used

Python

Machine Learning

Bayesian Networks

Heuristic Algorithms

Game Theory

Conclusion

This project highlights the potential of AI in social deduction games. By leveraging advanced strategies and technologies, the AI agents demonstrate the ability to adapt and excel, providing valuable insights into human behavior and decision-making.