How to Use AI Agents to Play Texas Hold’em Poker Like a Pro?

Niko Gamulin
18 min read · Jun 2, 2024


Poker is one of the most popular and challenging card games in the world. It requires a combination of skill, strategy, psychology, and luck to win. But what if you could use artificial intelligence to play poker like a pro? In this blog post, I will show you how I wrote code that uses AI agents to play Texas Hold’em Poker, a variant of poker that involves two hidden cards for each player and five community cards. I will also explain how these agents work, what their advantages and limitations are, and how they can help you improve your poker skills or automate your business processes.

What are AI agents and how do they work?

AI agents are computer programs that can perform tasks or actions based on their goals, environment, and knowledge. They can learn from their experiences, adapt to changing situations, and interact with other agents or humans. AI agents can be classified into different types, such as:

Reactive agents

These agents act based on their current perception of the environment, without using any memory or learning. They are simple and fast, but they cannot handle complex or dynamic situations.

Model-based agents

These agents use a model of the environment to predict the outcomes of their actions and choose the best one. They can deal with uncertainty and change, but they require a lot of information and computation.

Goal-based agents

These agents have a specific goal or objective that they want to achieve, and they act based on their model of the environment and their goal. They can plan ahead and optimize their actions, but they may not consider the preferences or values of other agents or humans.

Utility-based agents

These agents have a utility function that measures how desirable a state or outcome is, and they act based on their model of the environment and their utility function. They can balance multiple goals and trade-offs, but they may not be transparent or explainable.

Learning agents

These agents can improve their performance and knowledge over time by learning from their actions and feedback. They can use different learning methods, such as reinforcement learning, supervised learning, or unsupervised learning. They can cope with new or unknown situations, but they may need a lot of data and time to learn.
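To make the difference between two of these types concrete, here is a minimal sketch contrasting a reactive agent with a learning agent. It is not part of the poker project: `hand_strength` is a hypothetical score between 0 and 1, and the class and function names are invented for illustration.

```python
import random


def reactive_agent(hand_strength: float, to_call: int) -> str:
    """Reactive: decide from the current observation only, using fixed rules."""
    if to_call == 0:
        return "check"
    return "call" if hand_strength >= 0.5 else "fold"


class LearningAgent:
    """Learning: keep a running value estimate per action and update it from rewards."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.values = {a: 0.0 for a in actions}  # estimated reward per action
        self.counts = {a: 0 for a in actions}    # how often each action was tried
        self.epsilon = epsilon                   # exploration rate
        self.rng = random.Random(seed)

    def act(self) -> str:
        # Explore occasionally, otherwise pick the best-known action
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def learn(self, action: str, reward: float) -> None:
        # Incremental mean: new = old + (reward - old) / n
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


agent = LearningAgent(["fold", "call"])
agent.learn("call", 4.0)   # won 4 chips after calling
agent.learn("call", -2.0)  # lost 2 chips after calling
print(reactive_agent(0.7, to_call=2), agent.values["call"])
```

The reactive agent gives the same answer for the same observation every time, while the learning agent's estimate for "call" moves as rewards arrive.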

The Code

The code for this project is written in Python and uses the LangChain framework, which provides abstractions and components for building applications with large language models (LLMs). I used OpenAI’s GPT models, which are generative models that produce natural language text from a given prompt. I implemented the poker logic and the game environment from scratch, without using any external poker API or library.

The main idea behind the code is to create three poker agents, each using a different version of the GPT model:

  • Evan: This agent uses the GPT-4 Turbo model, a cheaper GPT-4 variant with a 128k-token context window. OpenAI has not published its parameter count. Evan is expected to be the strongest and most creative player among the three agents.
  • Jack: This agent uses the GPT-3.5 Turbo 1106 model, a snapshot of the older GPT-3.5 family that preceded GPT-4. Its parameter count is also unpublished. Jack is expected to be a competent and reliable player, but not as strong as Evan.
  • Owen: This agent uses the GPT-4o model, OpenAI’s multimodal model released in May 2024, which handles text, audio, and images. Its parameter count is unpublished as well. Owen is expected to be the weakest and most limited player among the three agents.

Poker API object

I used the functions from the poker_helpers script, which I wrote myself, to create a PokerGame class (named Game in the simplified listing below). This class handled the logic and the rules of the poker game, such as dealing the cards, updating the board, and managing the pot and the bets. It also had a method to return the game state, which consisted of the following data:

  • The role of each player (dealer, small blind, or big blind)
  • The hand of each player (two cards)
  • The board (up to five community cards, depending on the stage)
  • The bet history (a list of tuples containing the player name, the bet amount, the bet type, and the game stage)
  • The chip count of each player
  • The current bet
  • The game stage (pre-flop, flop, turn, or river)

```python
import random

# create_deck, deal_card, best_hand, hand_rank, Player and PLAYER_ROLES are
# assumed to come from the poker_helpers script (not shown here).


class Game:
    def __init__(self, players, starting_chips):
        self.deck = create_deck()  # Create a deck of cards
        random.shuffle(self.deck)  # Shuffle it in place
        self.players = [Player(name, starting_chips) for name in players]  # Initialize players
        self.community_cards = []  # List to hold community cards
        self.pot = 0  # The total amount of chips in the pot
        self.current_bet = 0  # The current highest bet
        self.winner = None  # The winner of the game

    def assign_roles(self):
        """Assign roles to players (Dealer, Small Blind, Big Blind)."""
        for player, role in zip(self.players, PLAYER_ROLES):
            player.role = role

    def deal_to_players(self):
        """Deals two cards to each player."""
        for _ in range(2):
            for player in self.players:
                player.receive_card(deal_card(self.deck))

    def deal_community_card(self):
        """Deals a single community card."""
        self.community_cards.append(deal_card(self.deck))

    def take_bets(self):
        """Handles a round of betting."""
        for player in self.players:
            bet = min(player.chips, self.current_bet)  # Simplified betting logic
            player.place_bet(bet, self)
            self.pot += bet

    def play_round(self):
        """Plays a full round of Texas Hold'em (simplified)."""
        self.deal_to_players()
        self.take_bets()

        # Flop (deal three community cards)
        for _ in range(3):
            self.deal_community_card()
        self.take_bets()

        # Turn (deal one community card)
        self.deal_community_card()
        self.take_bets()

        # River (deal one community card)
        self.deal_community_card()
        self.take_bets()

        # Showdown to determine the winner
        self.showdown()

    def showdown(self):
        """Determine the winner (simplified)."""
        # Evaluate the best hand for each active player
        player_hands = [
            (player, best_hand(player.hand, self.community_cards))
            for player in self.players
            if player.is_active
        ]
        if len(player_hands) == 0:
            print("All players have folded. No winner.")
            return
        # Find the player with the best hand
        winner = max(player_hands, key=lambda x: hand_rank(x[1]))[0]
        print(f"The winner is {winner.name} with hand {winner.hand}")
        for player in self.players:
            print(f"{player.name} has {player.hand} with community cards {self.community_cards}")
        print(f"The pot is {self.pot} chips.")
        self.winner = winner
```
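The state fields listed before the class can be gathered into a single object. The following is a minimal sketch under assumptions: the `GameState` name, its field names, and `stage_from_board` are hypothetical and are not taken from the original code. The sample values mirror the prompt example shown later in this post, where Jack is the small blind and Evan the big blind.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple

# (player name, bet amount, bet type, game stage)
BetRecord = Tuple[str, int, str, str]


@dataclass
class GameState:
    """Hypothetical container for the state fields listed in the post."""
    roles: Dict[str, str]
    hands: Dict[str, List[str]]
    board: List[str] = field(default_factory=list)
    bet_history: List[BetRecord] = field(default_factory=list)
    chips: Dict[str, int] = field(default_factory=dict)
    current_bet: int = 0
    stage: str = "Pre-flop"


def stage_from_board(board: List[str]) -> str:
    """Derive the stage from the number of community cards on the board."""
    return {0: "Pre-flop", 3: "Flop", 4: "Turn", 5: "River"}[len(board)]


state = GameState(
    roles={"Owen": "Dealer", "Jack": "Small Blind", "Evan": "Big Blind"},
    hands={"Owen": ["8 of ♠", "7 of ♦"]},
    bet_history=[("Jack", 1, "Small Blind", "Pre-flop"), ("Evan", 2, "Big Blind", "Pre-flop")],
    chips={"Owen": 10},
    current_bet=2,
)
state.stage = stage_from_board(state.board)
print(asdict(state)["stage"])
```

Keeping the state in one object makes it easy to pass the same snapshot to the prompt builder and to the game log.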

I then used this state information to generate a prompt for each model, which asked the model to make a decision and explain its reasoning. The prompt had the following format:

```python
betting_prompt = """
You are a poker agent playing Texas Hold'em.

Assess the current situation and decide what kind of bet to make.

Your current properties are:
- Chips: {chips}
- Hand: {hand}

Take into account the community cards and the current bet value to make your decision:
- Community Cards: {community_cards}
- Current Bet: {current_bet}

Review the bet history and opponent behavior to make your decision:
- Bet History: {bet_history}

Based on this information, decide your next move. Your options are:
- Fold: If your hand is weak and opponents show strength.
- Call: If the bet value is reasonable and your hand has potential.
- Raise: If your hand is strong and you want to increase the pot size or bluff.
- Check: If no bet is required and you want to see the next card for free.

You must have enough chips to call or raise.

Make a decision now and provide a brief explanation for your choice.

Format Instructions: {format_instructions}
"""
```
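Filling the placeholders is plain string formatting. The sketch below uses a shortened stand-in template with the same placeholder names so that it runs on its own; `format_bet_history` and `build_prompt` are hypothetical helper names, and the exact wording of the bet-history lines is copied from the prompt example in this post.

```python
from typing import List, Tuple

# Shortened stand-in for betting_prompt with the same placeholders
TEMPLATE = (
    "Chips: {chips}\n"
    "Hand: {hand}\n"
    "Community Cards: {community_cards}\n"
    "Current Bet: {current_bet}\n"
    "Bet History:\n{bet_history}\n"
    "Format Instructions: {format_instructions}"
)


def format_bet_history(history: List[Tuple[str, int, str, str]]) -> str:
    """Render (player, amount, bet type, stage) tuples as one line each."""
    if not history:
        return "No bets yet"
    return "\n".join(
        f"{player} bet {amount} as {bet_type} in stage {stage}"
        for player, amount, bet_type, stage in history
    )


def build_prompt(chips, hand, community_cards, current_bet, history, format_instructions):
    """Fill every placeholder; an empty board is rendered as 'Empty'."""
    return TEMPLATE.format(
        chips=chips,
        hand=", ".join(hand),
        community_cards=", ".join(community_cards) or "Empty",
        current_bet=current_bet,
        bet_history=format_bet_history(history),
        format_instructions=format_instructions,
    )


prompt = build_prompt(
    chips=10,
    hand=["8 of ♠", "7 of ♦"],
    community_cards=[],
    current_bet=2,
    history=[("Jack", 1, "Small Blind", "Pre-flop"), ("Evan", 2, "Big Blind", "Pre-flop")],
    format_instructions="(schema text goes here)",
)
print(prompt)
```

Because the format instructions are passed as a value rather than written into the template, any curly braces in the JSON schema do not clash with `str.format`.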

Prompt Example

You are a poker agent playing Texas Hold'em.

Assess the current situation and decide what kind of bet to make.

Your Current Properties:
Chips: 10
Hand: 8 of ♠, 7 of ♦
Consider the Following to Make Your Decision:
Community Cards: Empty
Current Bet: 2
Bet History and Opponent Behavior:
Bet History:
Jack bet 1 as Small Blind in stage Pre-flop
Evan bet 2 as Big Blind in stage Pre-flop
Based on this information, decide your next move. Your options are:

Fold: If your hand is weak and opponents show strength.
Call: If the bet value is reasonable and your hand has potential.
Raise: If your hand is strong and you want to increase the pot size or bluff.
Check: If no bet is required and you want to see the next card for free.
You must have enough chips to call or raise.

Make a decision now and provide a brief explanation for your choice.

Format Instructions:
The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
    "betting_decision": string, // Player's betting decision. Possible values: Fold, Call, Raise, Check.
    "raise_amount": string, // Amount to raise the current bet in case of Raise. The value is ignored for other decisions. The player must have enough chips to call or raise.
    "explanation": string // Explanation of the player's decision.
}
```

I then sent this prompt to the corresponding model through LangChain and parsed the reply. The parsed output contained the model’s decision and explanation. For example, the output for the above prompt is the following:

```python
{
    'betting_decision': 'Fold',
    'raise_amount': '',
    'explanation': 'With an 8 of ♠ and a 7 of ♦, my hand is relatively weak pre-flop. Considering that both opponents have already placed bets (small blind and big blind), it is likely they might have stronger hands or better potential. With only 10 chips, conserving my chips for a better hand seems prudent.'
}
```

It’s fascinating to note how closely the agent’s stated reasoning resembles the way a human player might explain the same fold:

With an 8 of ♠ and a 7 of ♦, my hand is relatively weak pre-flop. Considering that both opponents have already placed bets (small blind and big blind), it is likely they might have stronger hands or better potential. With only 10 chips, conserving my chips for a better hand seems prudent.
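Turning a fenced JSON reply into a dictionary like the one above can be sketched with the standard library alone. This is an illustration rather than the LangChain parser used in the project; `parse_decision` and `ALLOWED` are hypothetical names.

```python
import json
import re

FENCE = "`" * 3  # the three-backtick markdown fence
ALLOWED = {"Fold", "Call", "Raise", "Check"}
FENCED_JSON = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE), re.DOTALL
)


def parse_decision(reply: str) -> dict:
    """Extract the JSON object from a reply, with or without a markdown fence."""
    match = FENCED_JSON.search(reply)
    payload = match.group(1) if match else reply.strip()
    decision = json.loads(payload)  # raises json.JSONDecodeError on malformed output
    if not isinstance(decision, dict) or decision.get("betting_decision") not in ALLOWED:
        raise ValueError(f"Unexpected decision payload: {decision!r}")
    return decision


reply = (
    FENCE + "json\n"
    '{"betting_decision": "Fold", "raise_amount": "", "explanation": "Weak hand pre-flop."}\n'
    + FENCE
)
print(parse_decision(reply))
```

Rejecting unknown decision values at this point keeps malformed replies out of the game logic.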

Model Feasibility

I also validated the model’s decision to make sure it was legal and feasible. For example, the model could not raise more chips than it had, or check when the current bet was not zero. If the model’s decision was invalid, I generated an error message and asked the model to try again.

```python
def place_bet(self, amount, game):
    """Places a bet and updates the player's chips and bet amount."""
    if amount > self.chips:
        # Reject the bet so the caller can report the error and ask the model again
        raise ValueError(f"{self.name} doesn't have enough chips to bet {amount}.")
    self.chips -= amount
    self.bet += amount
```
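The validate-and-retry loop described above can be sketched as follows. The function names, the retry limit, and the fallback to folding are assumptions made for this illustration, and `to_call` is simplified to the current bet. The `ask` callable stands in for the model call and receives the previous error message.

```python
from typing import Callable, Optional


def validation_error(decision: dict, chips: int, to_call: int) -> Optional[str]:
    """Return an error message for an illegal move, or None if the move is legal."""
    move = decision.get("betting_decision")
    if move == "Fold":
        return None
    if move == "Check":
        return None if to_call == 0 else f"Cannot check: {to_call} chips are needed to call."
    if move == "Call":
        return None if chips >= to_call else f"Cannot call {to_call} with {chips} chips."
    if move == "Raise":
        try:
            raise_by = int(decision.get("raise_amount", ""))
        except (TypeError, ValueError):
            return "Raise needs a whole-number raise_amount."
        if raise_by <= 0:
            return "Raise amount must be positive."
        if to_call + raise_by > chips:
            return f"Cannot raise by {raise_by}: only {chips} chips available."
        return None
    return f"Unknown decision: {move!r}"


def decide_with_retries(ask: Callable[[Optional[str]], dict], chips: int, to_call: int,
                        max_attempts: int = 3) -> dict:
    """Ask for a decision, feeding the error message back until the move is legal."""
    error = None
    for _ in range(max_attempts):
        decision = ask(error)
        error = validation_error(decision, chips, to_call)
        if error is None:
            return decision
    # Assumption: fold if the model never produces a legal move
    return {"betting_decision": "Fold", "raise_amount": "", "explanation": error}


# Stand-in for the model: the first reply is illegal, the second is legal
replies = iter([
    {"betting_decision": "Raise", "raise_amount": "20"},
    {"betting_decision": "Call", "raise_amount": ""},
])
print(decide_with_retries(lambda err: next(replies), chips=10, to_call=2))
```

Capping the number of attempts avoids an endless loop when a model keeps repeating the same illegal move.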

I then used the Poker API object to execute the model’s decision and update the game state accordingly. I also stored the model’s decision and explanation in the reasoning history, which was a list of tuples containing the player name, the output text, the game stage, the bet amount, the board, and the bet history.

I repeated this process for each player and each round until the game was over. The game ended when one of the following conditions was met:

  • All but one player folded, in which case the remaining player won the pot.
  • All players reached the river stage and showed their cards, in which case the player with the best hand won the pot.
  • One player ran out of chips, in which case the game was paused and the remaining players split the pot according to their chip contribution.
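Splitting a pot according to chip contribution, as in the last condition, could look like the sketch below. The function name and the rule for handing out chips lost to rounding are assumptions; the original code may resolve remainders differently.

```python
from typing import Dict


def split_pot(pot: int, contributions: Dict[str, int]) -> Dict[str, int]:
    """Split `pot` in proportion to each player's contribution, in whole chips."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("No contributions to split by.")
    shares = {name: pot * paid // total for name, paid in contributions.items()}
    # Hand out chips lost to rounding down, largest contributor first (assumed tie-break)
    leftover = pot - sum(shares.values())
    for name in sorted(contributions, key=contributions.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares


print(split_pot(10, {"Evan": 4, "Jack": 2}))
```

Working in whole chips and redistributing the remainder keeps the shares summing exactly to the pot.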

At the end of the game, I printed the final game state, the winner, the winning amount, and the reasoning history for each player. I also saved the game data as a pickle file in a local directory.
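Saving and reloading a game with pickle takes only a few lines. In this sketch the dictionary layout and the file name are illustrative assumptions, not the schema of the original data files; the tuple in the reasoning history follows the field order described above.

```python
import pickle
import tempfile
from pathlib import Path

# Illustrative record; the keys and values are made up for this example
game_data = {
    "winner": "Evan",
    "winning_amount": 12,
    "reasoning_history": [
        # (player, output text, stage, bet amount, board, bet history)
        ("Owen", "Fold: weak hand pre-flop", "Pre-flop", 0, [],
         [("Jack", 1, "Small Blind", "Pre-flop")]),
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "game_001.pkl"
    with path.open("wb") as f:
        pickle.dump(game_data, f)   # save after the game ends
    with path.open("rb") as f:
        restored = pickle.load(f)   # load later for analysis

print(restored == game_data)
```

Pickle files can execute code when loaded, so only load files you created yourself.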

How did I use AI agents to play Texas Hold’em Poker?

I used a combination of different types of AI agents to play Texas Hold’em Poker, a game that involves two hidden cards for each player and five community cards. The game has four betting rounds: pre-flop, flop, turn, and river. The objective of the game is to have the best five-card poker hand at the showdown, or to make the other players fold before that.

I used the following AI agents to play the game:

  • Dealer agent: This agent is a reactive agent that follows the rules of the game and deals the cards, collects the bets, and determines the winner. It does not have any memory or learning, and it does not interact with the other agents.
  • Player agents: These agents combine model-based, goal-based, and utility-based behavior and play the game against each other. Their model of the environment is the game state passed in the prompt: the cards, the bets, the pot, and the actions of the other agents. Their goal is to maximize their expected utility, which is a function of their probability of winning and the amount of money they can win or lose. The underlying GPT models are not retrained during play, so any adaptation comes from the bet history included in each prompt, and each decision is returned together with a written explanation of the action and strategy.
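The idea of utility as a function of win probability and chips at stake can be written down directly. This is only an illustration of the concept: the LLM agents in this project are not given such a formula, and `p_win` is assumed to be known here, which it is not at a real table.

```python
def expected_value_of_call(p_win: float, pot: int, to_call: int) -> float:
    """Expected chip change from calling: win the pot with p_win, lose the call otherwise."""
    return p_win * pot - (1.0 - p_win) * to_call


def best_action(p_win: float, pot: int, to_call: int) -> str:
    """Call when the expected value is positive; folding is worth 0 from this point on."""
    return "Call" if expected_value_of_call(p_win, pot, to_call) > 0 else "Fold"


print(expected_value_of_call(0.5, pot=6, to_call=2))
print(best_action(0.5, pot=6, to_call=2), best_action(0.1, pot=6, to_call=2))
```

With a pot of 6 and 2 chips to call, a 50% chance of winning makes the call profitable, while a 10% chance does not.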

I wrote code that simulates the game and the agents, and I used Python’s pickle module to save and load the game data. I also used pandas, matplotlib, and seaborn to analyze and visualize the game results. You can find the code here.

Results and insights from the game

Figure: Average winning amount per model (comparison of GPT-3.5, GPT-4, and GPT-4o)

I ran 50 games with three player agents: Evan, Owen, and Jack. As described above, Evan uses gpt-4-turbo, Owen uses gpt-4o, and Jack uses gpt-3.5-turbo-1106, an older and less capable model. I wanted to compare the performance and behavior of the different AI models in the game.

Here are some of the results and insights from the game:

  • Evan won the most games (11), followed by Jack (8), and Owen (2). Evan also won the most money (173 chips), followed by Jack (116 chips), and Owen (36 chips).
  • Dividing chips won by games won gives an average win of about 15.7 chips for Evan (173/11), 14.5 chips for Jack (116/8), and 18.0 chips for Owen (36/2). Evan’s wins were therefore slightly larger than Jack’s on average, while Owen’s figure rests on only two wins and says little about his consistency.
  • In each game the three agents held the roles of dealer, small blind, and big blind, and the per-role win counts below show that the roles changed between games. The role determines the order of the actions and the forced bets. After the flop, the small blind acts first, the big blind second, and the dealer last, which gives the dealer the most information. Before the flop, the blinds are posted first and the dealer is the first to make a voluntary decision. The small blind and the big blind also have to pay a fixed amount of chips before the cards are dealt, which reduces their stack size.
  • The role of the agent also affects the number and the amount of the victories. Evan won the most games as the dealer (7), followed by the small blind (3), and the big blind (1). Owen won the most games as the big blind (2), followed by the dealer (0), and the small blind (0). Jack won the most games as the big blind (5), followed by the dealer (3), and the small blind (0).
  • The role of the agent also affects the errors and the reasoning behind the actions. An error is a situation where the agent tries to bet more chips than it has, which the rules of the game do not allow. The reasoning is the text explanation of the agent’s action, based on its hand, the community cards, the bet history, and the model of the environment. The agent’s AI model generates this reasoning, which can be helpful for understanding and improving its strategy.
  • Evan made one error in the game, while Jack made eight errors. Owen did not make any errors. The errors were mainly caused by the agent trying to bet more chips than it had in the pre-flop or the flop stage, which indicates a lack of awareness or adaptation to the stack size and the bet size. The errors also affected the outcome of the game, as the agent had to forfeit the pot or fold the hand.
  • The reasoning of the agents varied in length, clarity, and logic. Some explanations were short and simple, others long and complex; some were clear and coherent, others vague and inconsistent; some were logical, others were not. The quality of the reasoning depended on the AI model, the hand, the community cards, the bet history, and the model of the environment.
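The per-role win counts reported above can be cross-checked and summed by role with a few lines of Python. The numbers are copied from the list; the dictionary layout is just one convenient way to hold them.

```python
# Wins by role as reported in the list above: agent -> {role: wins}
wins_by_role = {
    "Evan": {"Dealer": 7, "Small Blind": 3, "Big Blind": 1},
    "Owen": {"Dealer": 0, "Small Blind": 0, "Big Blind": 2},
    "Jack": {"Dealer": 3, "Small Blind": 0, "Big Blind": 5},
}
reported_wins = {"Evan": 11, "Owen": 2, "Jack": 8}

# Per-agent totals should match the reported win counts
for agent, roles in wins_by_role.items():
    assert sum(roles.values()) == reported_wins[agent], agent

# Total wins per role across all agents
role_totals = {}
for roles in wins_by_role.values():
    for role, wins in roles.items():
        role_totals[role] = role_totals.get(role, 0) + wins

print(role_totals)
```

Summed over the three agents, the dealer position accounts for 10 of the 21 reported wins, the big blind for 8, and the small blind for 3.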

How AI agents can help you play poker or automate your business processes

AI agents can help you play poker or automate your business processes in several ways, such as:

Learning from data and feedback

AI agents can learn from their own actions and the outcomes of the game, as well as from the actions and the feedback of other agents or humans. They can use different learning methods, such as reinforcement learning, supervised learning, or unsupervised learning, to improve their performance and knowledge over time.

Adapting to changing situations

AI agents can adapt to changing situations, such as new or unknown cards, bets, opponents, or rules. They can use their model of the environment to predict the consequences of their actions and choose the best one. They can also update their model of the environment based on new information or evidence.

Interacting with other agents or humans

AI agents can interact with other agents or humans, such as cooperating, competing, negotiating, or communicating. They can use natural language processing to generate and understand text or speech, and they can use computer vision to recognize and interpret images or videos. They can also use social skills, such as empathy, persuasion, or humor, to influence or entertain others.

Explaining their actions and strategies

AI agents can explain their actions and strategies, such as why they did something, how they did something, or what they plan to do next. They can use natural language generation to produce text explanations, or they can use visualizations or diagrams to illustrate their explanations. They can also use reasoning modules to provide logical arguments or evidence for their explanations.

How do the agents think and reason?

One of the most fascinating aspects of using AI agents to play poker is how they think and reason about the game. Unlike humans, who rely on intuition, experience, and emotions, AI agents use logic, probability, and optimization to make their decisions. They also use natural language processing and generation to communicate and explain their actions. In this section, I will show you some examples of how the agents think and reason in different situations.

Pre-flop stage

In this stage, the agents only have their two hidden cards and no community cards. They have to decide whether to fold, call, or raise based on their hand strength, their position, and their stack size. They also have to consider the actions of the other players and their possible hands. For example, here is how Evan reasoned in one of the games:

You are Evan and you have Ace of Hearts and King of Diamonds in your hand. The board is empty. The bet history is [(Owen, 1, Small Blind, Pre-flop), (Jack, 2, Big Blind, Pre-flop)]. The current bet is 2. You have 10 chips left. It is your turn to act. What do you do and why?

The hand Ace of Hearts and King of Diamonds is one of the strongest starting hands in poker, especially in a three-player game. It has a high probability of making a pair, a straight, or a flush. Raising the current bet of 2 chips is a good move, as it shows confidence and puts pressure on the other players. The decision to raise also takes into account the position and the stack size. As the dealer, I have the advantage of acting last in each betting round, which gives me more information and options. As the shallowest stack, I have to be aggressive and try to win the pot quickly, before the blinds increase and reduce my chances of survival.
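How likely is it that Ace-King actually makes a pair? A short calculation gives a number for the claim in Evan's explanation. The sketch assumes two unpaired hole cards, so six of the 50 unseen cards (three aces and three kings) pair one of them; the function name is made up for this example.

```python
from math import comb


def prob_pair(board_cards: int, unseen: int = 50, pairing_cards: int = 6) -> float:
    """P(at least one board card pairs a hole card) for two unpaired hole cards."""
    misses = comb(unseen - pairing_cards, board_cards)  # boards with no ace and no king
    return 1 - misses / comb(unseen, board_cards)


print(f"by the flop: {prob_pair(3):.1%}, by the river: {prob_pair(5):.1%}")
```

The chance is roughly 32% on the flop and about 49% by the river, so "high probability" is fair only when all five community cards are counted.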

Flop stage

In this stage, the agents have their two hidden cards and three community cards. They have to decide whether to check, bet, call, raise, or fold based on their hand strength, their pot odds, and their implied odds. They also have to consider the actions of the other players and their possible hands. For example, here is how Jack reasoned in one of the games:

You are Jack and you have 9 of Clubs and 10 of Clubs in your hand. The board is 6 of Clubs, 7 of Clubs, and 8 of Diamonds. The bet history is [(Owen, 1, Small Blind, Pre-flop), (Jack, 2, Big Blind, Pre-flop), (Evan, 4, Raise, Pre-flop), (Owen, 4, Call, Pre-flop), (Jack, 4, Call, Pre-flop)]. The current bet is 0. You have 8 chips left. It is your turn to act. What do you do and why?

The hand 9 of Clubs and 10 of Clubs is a very strong hand in this situation, as it has made a straight with the board. It also has a flush draw, which means it can improve to a better hand if another club comes on the turn or the river. Betting the pot (12 chips) is a good move, as it maximizes the value of the hand and protects it from possible draws. The decision to bet also takes into account the position and the stack size. As the big blind, I have to act first in each betting round, which puts me at a disadvantage. As the second shallowest stack, I have to be aggressive and try to win the pot quickly, before the blinds increase and reduce my chances of survival.
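The pot odds mentioned for this stage reduce to a small comparison: the share of the final pot a call has to pay for, against the chance of completing the draw. The numbers below are illustrative and not taken from a recorded game, and the function names are hypothetical.

```python
def required_equity(pot: int, to_call: int) -> float:
    """Break-even win probability for a call: call / (pot + call)."""
    return to_call / (pot + to_call)


def hit_probability(outs: int, unseen: int) -> float:
    """Chance that the next card is one of the outs."""
    return outs / unseen


# Illustrative spot: 4 chips to call into a pot of 12, flush draw with 9 outs on the flop
need = required_equity(pot=12, to_call=4)   # 4 / 16
have = hit_probability(outs=9, unseen=47)   # 9 suited cards left among 47 unseen cards
print(f"need {need:.1%}, have {have:.1%} on the next card")
```

When the chance of hitting is below the required equity, a call is only justified by implied odds, that is, by chips expected to be won on later streets.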

Turn stage

In this stage, the agents have their two hidden cards and four community cards. They have to decide whether to check, bet, call, raise, or fold based on their hand strength, their pot odds, and their implied odds. They also have to consider the actions of the other players and their possible hands. For example, here is how Owen reasoned in one of the games:

You are Owen and you have 2 of Hearts and 3 of Hearts in your hand. The board is Jack of Hearts, Queen of Hearts, King of Hearts, and 4 of Spades. The bet history is [(Owen, 1, Small Blind, Pre-flop), (Jack, 2, Big Blind, Pre-flop), (Evan, 2, Call, Pre-flop), (Owen, 2, Call, Pre-flop), (Jack, 2, Check, Pre-flop), (Evan, 4, Bet, Flop), (Owen, 4, Call, Flop), (Jack, 4, Fold, Flop)]. The current bet is 0. You have 5 chips left. It is your turn to act. What do you do and why?

The hand 2 of Hearts and 3 of Hearts is a weak hand in this situation, as it has only a low pair and a low flush draw. It is unlikely to win against the other players’ hands, which could have a higher pair, a higher flush, or a straight. Folding the hand is a good move, as it saves the chips for a better opportunity. The decision to fold also takes into account the position and the stack size. As the small blind, I have to act second in each betting round, which gives me some information and options. As the deepest stack, I have to be conservative and try to survive until the other players run out of chips or make a mistake.

River stage

In this stage, the agents have their two hidden cards and five community cards. They have to decide whether to check, bet, call, raise, or fold based on their hand strength, their pot odds, and their implied odds. They also have to consider the actions of the other players and their possible hands. For example, here is how Evan reasoned in one of the games:

You are Evan and you have Jack of Spades and 10 of Spades in your hand. The board is 9 of Spades, Queen of Spades, King of Spades, 2 of Diamonds, and 5 of Diamonds. The bet history is [(Owen, 1, Small Blind, Pre-flop), (Jack, 2, Big Blind, Pre-flop), (Evan, 4, Raise, Pre-flop), (Owen, 4, Call, Pre-flop), (Jack, 4, Call, Pre-flop), (Evan, 8, Bet, Flop), (Owen, 8, Call, Flop), (Jack, 8, Fold, Flop), (Evan, 16, Bet, Turn), (Owen, 16, Call, Turn)]. The current bet is 0. You have 2 chips left. It is your turn to act. What do you do and why?

The hand Jack of Spades and 10 of Spades is a very strong hand in this situation, as it has made a royal flush with the board. A royal flush is the highest possible hand in poker, and it cannot be beaten by any other hand. Betting all-in (2 chips) is a good move, as it tries to extract the maximum value from the hand and hopes that the other player will call. The decision to bet also takes into account the position and the stack size. As the dealer, I have the advantage of acting last in each betting round, which gives me more information and options. As the shallowest stack, I have nothing to lose and everything to gain by betting all-in.
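Hand labels in the explanations are easy to check mechanically. The sketch below is a partial checker written for this post, not the `best_hand` or `hand_rank` helpers from the project: it only detects flushes and straight flushes, and it encodes cards as hypothetical (rank, suit) tuples.

```python
from collections import Counter
from typing import List, Optional, Tuple

Card = Tuple[int, str]  # (rank 2..14 with 11=J, 12=Q, 13=K, 14=A, suit letter)


def flush_suit(cards: List[Card]) -> Optional[str]:
    """Return the suit that appears at least five times, if any."""
    suit, count = Counter(s for _, s in cards).most_common(1)[0]
    return suit if count >= 5 else None


def straight_flush_high(cards: List[Card]) -> Optional[int]:
    """Return the top rank of the best straight flush, or None."""
    suit = flush_suit(cards)
    if suit is None:
        return None
    ranks = {r for r, s in cards if s == suit}
    if 14 in ranks:
        ranks.add(1)  # the ace also plays low in A-2-3-4-5
    for high in range(14, 4, -1):
        if all(r in ranks for r in range(high - 4, high + 1)):
            return high
    return None


# Owen's turn cards: 2♥ 3♥ with J♥ Q♥ K♥ 4♠ on the board
owen = [(2, "h"), (3, "h"), (11, "h"), (12, "h"), (13, "h"), (4, "s")]
# Evan's river cards: J♠ 10♠ with 9♠ Q♠ K♠ 2♦ 5♦ on the board
evan = [(11, "s"), (10, "s"), (9, "s"), (12, "s"), (13, "s"), (2, "d"), (5, "d")]

print(flush_suit(owen), straight_flush_high(owen))
print(straight_flush_high(evan))
```

For Owen's turn cards the checker reports a made heart flush, and for Evan's river cards it reports a straight flush topped by the king (13); a royal flush would need the ace of spades and return 14.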

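As you can see, the agents think and reason differently depending on their AI model, their hand, the board, the bet history, and the game stage. They also use different styles and levels of detail to explain their actions, and they make mistakes: Owen’s turn explanation overlooks that his two hearts and the three hearts on the board already make a flush, and Evan’s river hand is a king-high straight flush rather than a royal flush, which would require the ace of spades. By analyzing their reasoning, you can learn more about their strengths and weaknesses, and how to exploit or improve them. You can also compare and contrast their reasoning with your own, and see how you can improve your poker skills or automate your business processes.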

Conclusion

In this blog post, I showed you how I wrote code that uses AI agents to play Texas Hold’em Poker, a game that involves two hidden cards for each player and five community cards. I also explained how these agents work, what their advantages and limitations are, and how they can help you improve your poker skills or automate your business processes. I hope you enjoyed reading this post and learned something new. If you want to try the code or see the data, you can find them here: link.


Written by Niko Gamulin

Scientist. Engineer. Sportsman. Trying to understand what’s going on behind the scenes.