✨Tips and tricks
Here are some tips and tricks for creating your bot.
The Manhattan Distance is a useful way to calculate distances on a grid-based map such as this one:
```python
# returns the Manhattan distance between two tiles, calculated as:
# |x1 - x2| + |y1 - y2|
def manhattan_distance(self, start, end):
    distance = abs(start[0] - end[0]) + abs(start[1] - end[1])
    return distance
```
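If you want to sanity-check the formula outside your Agent class, here's a standalone version with a worked example (the tile coordinates are illustrative):

```python
# standalone Manhattan distance between two (x, y) tuples
def manhattan_distance(start, end):
    return abs(start[0] - end[0]) + abs(start[1] - end[1])

# e.g. from (0, 0) to (3, 4): |0 - 3| + |0 - 4| = 7
print(manhattan_distance((0, 0), (3, 4)))  # 7
```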
In case you're looking to visualise the entire game map in some form (e.g. as an array), here's an example implementation:
```python
# make sure you've imported numpy at the start of your script (import numpy as np)
# print the game map as an array using numpy
def print_map(self, game_state):
    # note the y-index will be flipped,
    # since numpy arrays index from the top left as (0,0),
    # whereas our map follows a Cartesian coordinate system with the bottom left as (0,0)
    cols = game_state.size
    rows = game_state.size
    game_map = np.zeros((rows, cols)).astype(str)
    for x in range(cols):
        for y in range(rows):
            entity = game_state.entity_at((x, y))
            if entity is not None:
                game_map[y][x] = entity
            else:
                game_map[y][x] = 9  # using 9 here as 'free' since 0 = Player 1
    print(game_map)

# thank you: @ifitaintbroke for providing a fixed version
```
Due to a limitation in the way Agents interact with the game environment, there may be a lag in the update of the `game_state` your Agent receives on each 'tick'. For example, when an Agent produces an action, the effects of that action may not become observable in the next 'tick', but only in the 'tick' after that.
As a workaround, you can pre-plan and store your Agent's action, e.g.:
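A minimal sketch of the idea (the `planned_actions` list and the action strings are illustrative, not part of the game API):

```python
# store a queue of pre-planned actions, e.g. 'p' = place bomb, 'u' = move up
# (action strings are illustrative -- use whichever actions your game expects)
planned_actions = []

# queue the moves in reverse order, because pop() takes from the end of the list
planned_actions.extend(['u', 'p'])

action = planned_actions.pop()  # 'p' comes off first
```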
Then have your Agent choose the next planned action (e.g. using `action = planned_actions.pop()`) after checking whether the game state has updated.
One method is to check whether the current `game_state.tick_number` is greater than the previous `tick_number` by a certain delay threshold (e.g. one or two ticks) before sending your next move.
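That check can be sketched like this (the helper name, `last_tick` variable, and delay value are illustrative assumptions):

```python
# sketch: only act once the game state has advanced past our last action
# by a chosen delay threshold (here, two ticks)
DELAY = 2

def ready_to_act(current_tick, last_tick, delay=DELAY):
    # True once at least `delay` ticks have passed since our last action
    return current_tick - last_tick >= delay
```

Inside your Agent's move handler, you would compare `game_state.tick_number` against the tick on which you last sent an action.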
Another method is to use `game_state.entity_at(my_location)` to check whether your action has been executed (e.g. if `game_state.entity_at(my_location) == 'p'`, then you know your Agent has successfully placed its bomb).
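A sketch of that check, using a stub in place of the real game state (the `FakeGameState` class exists only to make the example self-contained; `'p'` follows the example above):

```python
# stub standing in for the real game_state, which exposes entity_at(location)
class FakeGameState:
    def __init__(self, entities):
        self.entities = entities  # dict mapping (x, y) -> entity

    def entity_at(self, location):
        return self.entities.get(location)

def bomb_placed(game_state, my_location):
    # confirm our bomb placement actually landed before planning the next move
    return game_state.entity_at(my_location) == 'p'
```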
Since your Agent is a class object, instead of returning one action at a time, you can also store and pre-plan a list of moves in one go. As the game executes, you can then tell it to choose from your pre-planned set of moves instead of having to process a new move.
For example, to store a move:
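One way to do this is to keep the list on the Agent instance (the method name and action strings here are illustrative):

```python
# sketch: storing a pre-planned sequence of moves on the Agent instance
class Agent:
    def __init__(self):
        self.planned_actions = []

    def plan_route(self):
        # intended order is 'u', then 'r', then 'r';
        # stored in reverse because pop() takes from the end of the list
        self.planned_actions = ['r', 'r', 'u']
```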
Then to use your planned action:
```python
# if we have actions stored, we'll execute these first
if self.planned_actions:
    action = self.planned_actions.pop()
else:
    # do stuff here to plan your next action
    ...
```
This can be useful to help you navigate to specific objects across the game map, or as a workaround for any `game_state` update syncing issues (see the Tips and Tricks note above).
- Monte Carlo Tree Search: A Python implementation
- Reinforcement Learning: An Introduction, Sutton & Barto (2017)
- Our own Tabular Q-Learning Tutorial (using OpenAI Gym's Taxi)