POMDPPlanners.environments.pacman_pomdp package

PacMan POMDP package with sprite-based visualization.

class POMDPPlanners.environments.pacman_pomdp.PacManObservationModel(next_state, action, pomdp)[source]

Bases: ObservationModel

Observation model for PacMan POMDP.

Parameters:
  • next_state (Any)

  • action (int)

  • pomdp (PacManPOMDP)

probability(values)[source]

Calculate observation probabilities for multi-ghost observations.

Return type:

ndarray

Parameters:

values (List[Tuple[Tuple[int, int], ...]])

sample(n_samples=1)[source]

Sample observations of all ghost positions with noise.

Return type:

List[Tuple[Tuple[int, int], ...]]

Parameters:

n_samples (int)

sample_closest_ghosts(max_ghosts=2, n_samples=1)[source]

Sample observations of only the closest ghosts.

Return type:

List[Tuple[Tuple[int, int], ...]]

Parameters:
  • max_ghosts (int)

  • n_samples (int)

class POMDPPlanners.environments.pacman_pomdp.PacManPOMDP(maze_size=(7, 7), walls=None, initial_pellets=None, initial_pacman_pos=(0, 0), num_ghosts=1, initial_ghost_positions=None, initial_ghost_pos=None, pellet_reward=10.0, ghost_collision_penalty=-100.0, step_penalty=-1.0, win_reward=100.0, ghost_aggressiveness=2.0, ghost_coordination='independent', ghost_strategies=None, observation_noise_factor=0.3, max_observation_noise=1.5, discount_factor=0.95, name='PacManPOMDP', output_dir=None, debug=False)[source]

Bases: DiscreteActionsEnvironment

PacMan POMDP environment inspired by the classic arcade game.

This environment implements a simplified PacMan game where PacMan must collect pellets while avoiding one or more ghosts. Ghost positions are only partially observable through noisy sensor readings.

Parameters:
maze_size

Grid dimensions as (rows, cols)

walls

Set of wall positions as (row, col) tuples

initial_pellets

List of initial pellet positions

pellet_reward

Reward for collecting a pellet

ghost_collision_penalty

Penalty for collision with ghost

step_penalty

Cost per action

win_reward

Reward for collecting all pellets

ghost_aggressiveness

Temperature parameter for ghost movement policy

observation_noise_factor

Multiplier for observation noise based on distance

max_observation_noise

Maximum noise standard deviation
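
Together, observation_noise_factor and max_observation_noise define a distance-dependent sensor model: nearby ghosts are reported sharply while distant ones blur, up to a cap. A minimal sketch of that relationship, assuming Manhattan distance (the metric the environment actually uses is not documented here):

```python
def observation_noise_std(pacman_pos, ghost_pos,
                          noise_factor=0.3, max_noise=1.5):
    """Noise std for one ghost reading: grows with distance, capped.

    Manhattan distance is an assumption; the environment's actual
    metric is an implementation detail not stated in this reference.
    """
    dist = (abs(pacman_pos[0] - ghost_pos[0])
            + abs(pacman_pos[1] - ghost_pos[1]))
    return min(noise_factor * dist, max_noise)
```

With the defaults above, an adjacent ghost is observed with std 0.3 while any ghost five or more Manhattan steps away saturates at std 1.5.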

Example

>>> import numpy as np
>>> np.random.seed(42)  # For reproducible results
>>>
>>> # Initialize environment
>>> env = PacManPOMDP(maze_size=(7, 7))
>>>
>>> # Get initial state and actions
>>> initial_state = env.initial_state_dist().sample()[0]
>>> actions = env.get_actions()
>>>
>>> # Sample complete step using convenience method
>>> action = actions[0]
>>> next_state, observation, reward = env.sample_next_step(initial_state, action)
>>>
>>> # Check terminal condition
>>> env.is_terminal(initial_state)
False
array_to_observation(arr)[source]

Convert a flat numpy array back to a PacMan observation tuple.

Parameters:

arr (ndarray) – 1-D array of shape (2 * num_ghosts,).

Return type:

Tuple[Tuple[int, int], ...]

Returns:

Observation as tuple of (row, col) tuples.

array_to_state(arr)[source]

Convert a numpy array back to a PacManState.

Parameters:

arr (ndarray) – 1-D array of shape (self._state_dim,) produced by state_to_array().

Return type:

PacManState

Returns:

Reconstructed PacManState.

cache_visualization(history, cache_path)[source]

Cache visualization of episode history.

Parameters:
  • history (List[StepData]) – List of StepData objects representing the episode

  • cache_path (Path) – Path where the GIF should be saved

Return type:

None

compute_metrics(histories)[source]

Compute environment-specific metrics.

Return type:

List[MetricValue]

Parameters:

histories (List[History])

get_actions()[source]

Get all available actions.

Return type:

List[int]

get_metric_names()[source]

Get names of PacMan POMDP specific metrics.

Return type:

List[str]

Returns:

List containing metric names including standard metrics (win_rate, avg_pellets_collected, avg_episode_length, avg_pacman_closest_ghost_distance, avg_collision_encounters) and dynamically generated per-ghost distance metrics for multi-ghost scenarios (avg_pacman_ghost_0_distance, avg_pacman_ghost_1_distance, etc.)

property initial_ghost_pos: Tuple[int, int]

Backward compatibility: returns the first ghost position.

initial_observation_dist()[source]

Get initial observation distribution.

Return type:

DiscreteDistribution

initial_state_dist()[source]

Get initial state distribution.

Return type:

DiscreteDistribution

is_equal_observation(observation1, observation2)[source]

Check if two observations are equal.

Return type:

bool

Parameters:
  • observation1 (Any)

  • observation2 (Any)

is_terminal(state)[source]

Check if state is terminal.

Return type:

bool

Parameters:

state (Any)

observation_model(next_state, action)[source]

Get observation model.

Return type:

PacManObservationModel

Parameters:
  • next_state (Any)

  • action (int)

observation_to_array(obs)[source]

Convert a PacMan observation tuple to a flat numpy array.

Parameters:

obs (Tuple[Tuple[int, int], ...]) – Observation as tuple of ghost (row, col) positions.

Return type:

ndarray

Returns:

1-D array of shape (2 * num_ghosts,).

reward(state, action)[source]

Calculate immediate reward.

Return type:

float

Parameters:
  • state (Any)

  • action (int)

reward_batch(states, action)[source]

Calculate rewards for a batch of states.

Accepts either a 2-D numpy array of shape (N, state_dim) (vectorized path) or a sequence of PacManState objects (falls back to the loop-based default).

Computes deterministic reward components only: step penalty, pellet collection, and win bonus. Ghost collision penalty is excluded because it depends on stochastic ghost movement.

Parameters:
  • states (Union[ndarray, Sequence[Any]]) – Array of shape (N, state_dim) or sequence of states.

  • action (int) – Discrete action index (0-3).

Return type:

ndarray

Returns:

1-D array of reward values with shape (N,).
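
The vectorized path can be sketched as follows. This is a simplified illustration, not the environment's implementation: the action coding and wall handling (walls are ignored here) are assumptions, with the pellet-mask offset taken from the layout documented for state_to_array().

```python
import numpy as np

# Assumed action coding; the environment's real mapping may differ.
DELTAS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

def reward_batch_sketch(states, action, pellet_positions, num_ghosts=1,
                        maze_size=(3, 3), step_penalty=-1.0,
                        pellet_reward=10.0, win_reward=100.0):
    """Deterministic reward terms over an (N, state_dim) particle array."""
    dr, dc = DELTAS[action]
    rows = np.clip(states[:, 0] + dr, 0, maze_size[0] - 1)
    cols = np.clip(states[:, 1] + dc, 0, maze_size[1] - 1)
    start = 2 + 2 * num_ghosts                       # pellet-mask offset
    mask = states[:, start:start + len(pellet_positions)]
    collected = np.zeros(states.shape[0], dtype=bool)
    for j, (pr, pc) in enumerate(pellet_positions):
        collected |= (rows == pr) & (cols == pc) & (mask[:, j] > 0)
    remaining_after = mask.sum(axis=1) - collected   # one pellet per cell
    return (step_penalty
            + pellet_reward * collected
            + win_reward * (remaining_after == 0))
```

As documented above, the ghost-collision penalty is deliberately absent: it depends on stochastic ghost movement, which a deterministic batch pass cannot resolve.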

state_to_array(state)[source]

Convert a PacManState to a fixed-size numpy array.

The array layout is: [pac_row, pac_col, g0_row, g0_col, ..., pellet_mask[0..P-1], score, terminal]

Parameters:

state (PacManState) – A PacManState instance.

Return type:

ndarray

Returns:

1-D float array of shape (self._state_dim,).
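
The layout above can be reproduced by a small standalone helper. This is an illustrative sketch, not the method itself; pellet_positions stands for the environment's initial pellet list, which is assumed to fix the ordering of the mask:

```python
import numpy as np

def state_to_array_sketch(pacman_pos, ghost_positions, pellets_remaining,
                          pellet_positions, score=0.0, terminal=False):
    """Flatten a state into the documented layout:
    [pac_row, pac_col, g0_row, g0_col, ..., pellet_mask[0..P-1],
     score, terminal].
    """
    flat = [float(pacman_pos[0]), float(pacman_pos[1])]
    for g in ghost_positions:
        flat.extend([float(g[0]), float(g[1])])
    # The mask is indexed by the environment's *initial* pellet list,
    # so array positions stay stable as pellets are eaten.
    flat.extend(1.0 if p in pellets_remaining else 0.0
                for p in pellet_positions)
    flat.extend([float(score), float(terminal)])
    return np.asarray(flat)
```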

state_transition_model(state, action)[source]

Get state transition model.

Return type:

PacManStateTransitionModel

Parameters:
  • state (Any)

  • action (int)

states_to_array(states)[source]

Batch-convert a list of PacManState to a 2-D numpy array.

Parameters:

states (List[PacManState]) – List of PacManState instances.

Return type:

ndarray

Returns:

Array of shape (len(states), self._state_dim).

visualize_path(path, actions, cache_path)[source]

Visualize PacMan path through the maze using sprite-based rendering.

Parameters:
  • path (List[PacManState]) – List of states representing the path through the maze

  • actions (List[int]) – List of actions taken at each step

  • cache_path (Path) – Path where the GIF should be saved

class POMDPPlanners.environments.pacman_pomdp.PacManState(pacman_pos, ghost_positions, pellets, score=0, terminal=False)[source]

Bases: object

State representation for PacMan POMDP.

Parameters:
pacman_pos

PacMan position as (row, col) tuple

ghost_positions

Tuple of ghost positions as (row, col) tuples

pellets

Tuple of remaining pellet positions as (row, col) tuples

score

Current game score

terminal

Whether the game has ended

property ghost_pos: Tuple[int, int]

Backward compatibility: returns the first ghost position.

ghost_positions: Tuple[Tuple[int, int], ...]
property num_ghosts: int

Number of ghosts in the game.

pacman_pos: Tuple[int, int]
pellets: Tuple[Tuple[int, int], ...]
score: int | float = 0
terminal: bool = False
class POMDPPlanners.environments.pacman_pomdp.PacManStateTransitionModel(state, action, pomdp)[source]

Bases: StateTransitionModel

State transition model for PacMan POMDP.

Parameters:
  • state (Any)

  • action (int)

  • pomdp (PacManPOMDP)

probability(values)[source]

Calculate transition probabilities to next states.

Parameters:

values (List[PacManState]) – List of potential next states

Return type:

ndarray

Returns:

Array of probabilities for each state in values

sample(n_samples=1)[source]

Sample next states.

Return type:

List[PacManState]

Parameters:

n_samples (int)

class POMDPPlanners.environments.pacman_pomdp.PacManVectorizedUpdater(maze_size, num_ghosts, num_pellets, state_dim, neighbor_table, neighbor_validity, pellet_positions, ghost_aggressiveness, ghost_coordination, ghost_strategies, observation_noise_factor, max_observation_noise, idx_pac_row, idx_pac_col, idx_ghosts_start, idx_pellets_start, idx_pellets_end, idx_score, idx_terminal)[source]

Bases: VectorizedParticleBeliefUpdater

Vectorized particle belief updater for PacMan POMDP.

Performs all-particle transitions and observation log-likelihood evaluations using vectorized NumPy operations. Ghost movement uses batched softmax sampling, and collision/pellet logic operates on the full particle array at once.

Parameters:
  • maze_size (Tuple[int, int])

  • num_ghosts (int)

  • num_pellets (int)

  • state_dim (int)

  • neighbor_table (np.ndarray)

  • neighbor_validity (np.ndarray)

  • pellet_positions (np.ndarray)

  • ghost_aggressiveness (float)

  • ghost_coordination (str)

  • ghost_strategies (List[str])

  • observation_noise_factor (float)

  • max_observation_noise (float)

  • idx_pac_row (int)

  • idx_pac_col (int)

  • idx_ghosts_start (int)

  • idx_pellets_start (int)

  • idx_pellets_end (int)

  • idx_score (int)

  • idx_terminal (int)

maze_size

Grid dimensions (rows, cols).

num_ghosts

Number of ghosts.

num_pellets

Number of initial pellets.

state_dim

Dimensionality of the array state.

ghost_aggressiveness

Softmax temperature for ghost pursuit.

ghost_coordination

Ghost coordination mode.

ghost_strategies

Per-ghost strategy list.

observation_noise_factor

Multiplier for observation noise.

max_observation_noise

Maximum observation noise std.

batch_observation_log_likelihood(next_particles, action, observation)[source]

Compute observation log-likelihoods for all particles at once.

Parameters:
  • next_particles (ndarray) – Transitioned particle states of shape (N, d).

  • action (ndarray) – Action vector.

  • observation (ndarray) – Observed value.

Return type:

ndarray

Returns:

Log-likelihoods of shape (N,).
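
A hedged sketch of what such a batched evaluation looks like, using a single fixed noise std for all ghosts (a simplification: the real model scales noise with distance via observation_noise_factor):

```python
import numpy as np

def batch_obs_loglik(ghost_coords, observation, std=1.0):
    """Per-particle Gaussian log-likelihood of a ghost observation.

    ghost_coords: (N, 2*G) predicted ghost coordinates per particle;
    observation: (2*G,) noisy reading; std: shared noise std (the
    environment's distance-dependent noise is collapsed to a scalar).
    """
    diff = (ghost_coords - observation) / std
    # Sum independent per-coordinate Gaussian log-densities.
    return -0.5 * np.sum(diff ** 2 + np.log(2 * np.pi * std ** 2), axis=1)
```

Particles whose predicted ghost positions match the observation score highest, which is exactly what the particle filter needs for reweighting.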

batch_transition(particles, action)[source]

Transition all particles in a single batched operation.

Parameters:
  • particles (ndarray) – Current particle states of shape (N, d).

  • action (ndarray) – Action vector.

Return type:

ndarray

Returns:

Next-state particles of shape (N, d).
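
The batched softmax ghost sampling mentioned in the class description can be sketched as below; the helper name, Manhattan-distance scoring, and inverse-CDF draw are illustrative assumptions, not the class's actual internals:

```python
import numpy as np

def sample_ghost_moves(neighbors, valid, pacman_pos,
                       aggressiveness=2.0, rng=None):
    """Batched softmax sampling of one ghost move per particle.

    neighbors: (N, K, 2) candidate cells, valid: (N, K) mask of legal
    moves, pacman_pos: (N, 2). Candidates closer to PacMan get higher
    probability; aggressiveness is the softmax temperature.
    """
    rng = np.random.default_rng() if rng is None else rng
    dist = np.abs(neighbors - pacman_pos[:, None, :]).sum(axis=2)  # (N, K)
    logits = np.where(valid, -aggressiveness * dist, -np.inf)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # Vectorized categorical draw via inverse CDF, one per particle.
    u = rng.random((probs.shape[0], 1))
    idx = (probs.cumsum(axis=1) < u).sum(axis=1)
    return neighbors[np.arange(len(idx)), idx]
```

Masking illegal moves with -inf before the softmax guarantees ghosts never step into walls, while keeping the whole operation a single pass over the particle array.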

property config_id: str

Return a deterministic identifier for this updater configuration.

classmethod from_environment(env)[source]

Construct an updater from a PacManPOMDP instance.

Parameters:

env (PacManPOMDP) – Environment to extract parameters from.

Return type:

PacManVectorizedUpdater

Returns:

A new PacManVectorizedUpdater instance.

POMDPPlanners.environments.pacman_pomdp.create_pacman_belief(env, belief_type=BeliefType.VECTORIZED_PARTICLE, n_particles=200, **kwargs)[source]

Create a ready-to-use belief for the PacMan POMDP.

Parameters:
  • env (PacManPOMDP) – PacManPOMDP environment instance.

  • belief_type (BeliefType) – Desired belief representation. Defaults to BeliefType.VECTORIZED_PARTICLE.

  • n_particles (int) – Number of particles. Defaults to 200.

  • **kwargs (Any) – Extra arguments (reserved for future use).

Return type:

Belief

Returns:

A configured Belief object.

Raises:

ValueError – If belief_type is not supported.

Example

>>> import numpy as np
>>> np.random.seed(42)
>>> from POMDPPlanners.environments.pacman_pomdp import PacManPOMDP
>>> env = PacManPOMDP(discount_factor=0.95)
>>> belief = create_pacman_belief(env, n_particles=50)
>>> belief.sample().shape[0] > 0
True
POMDPPlanners.environments.pacman_pomdp.create_simple_maze_pacman(maze_size=7, num_walls=5, num_ghosts=1, seed=None)[source]

Create a simple PacMan instance with random walls and multiple ghosts.

Parameters:
  • maze_size (int) – Size of square maze. Defaults to 7.

  • num_walls (int) – Number of walls to place randomly. Defaults to 5.

  • num_ghosts (int) – Number of ghosts in the game. Defaults to 1.

  • seed (Optional[int]) – Random seed. Defaults to None.

Return type:

PacManPOMDP

Returns:

Randomly configured PacMan POMDP with multi-ghost support
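
The random layout step can be sketched as sampling distinct wall cells while keeping PacMan's start free; this is an assumption about the placement logic, not the function's verified behavior:

```python
import numpy as np

def random_walls(maze_size=7, num_walls=5, seed=None,
                 reserved=((0, 0),)):
    """Sample distinct wall cells, keeping reserved cells (e.g. PacMan's
    default start at (0, 0)) free of walls."""
    rng = np.random.default_rng(seed)
    cells = [(r, c) for r in range(maze_size) for c in range(maze_size)
             if (r, c) not in reserved]
    idx = rng.choice(len(cells), size=num_walls, replace=False)
    return {cells[i] for i in idx}
```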

Subpackages

Submodules

POMDPPlanners.environments.pacman_pomdp.pacman_pomdp module

Module for PacMan POMDP environment.

This module provides the PacMan POMDP environment implementation inspired by the classic arcade game. The environment features a grid world where PacMan must collect pellets while avoiding ghosts, with partial observability of ghost positions.

The environment involves PacMan navigating a maze with walls, collecting pellets, and avoiding ghosts that move according to stochastic policies. PacMan receives noisy observations about nearby ghost positions.

Classes:

PacManState: Represents the state of the environment
PacManPOMDP: The main POMDP environment implementation

class POMDPPlanners.environments.pacman_pomdp.pacman_pomdp.PacManObservationModel(next_state, action, pomdp)[source]

Bases: ObservationModel

Observation model for PacMan POMDP.

Parameters:
  • next_state (Any)

  • action (int)

  • pomdp (PacManPOMDP)

probability(values)[source]

Calculate observation probabilities for multi-ghost observations.

Return type:

ndarray

Parameters:

values (List[Tuple[Tuple[int, int], ...]])

sample(n_samples=1)[source]

Sample observations of all ghost positions with noise.

Return type:

List[Tuple[Tuple[int, int], ...]]

Parameters:

n_samples (int)

sample_closest_ghosts(max_ghosts=2, n_samples=1)[source]

Sample observations of only the closest ghosts.

Return type:

List[Tuple[Tuple[int, int], ...]]

Parameters:
  • max_ghosts (int)

  • n_samples (int)

class POMDPPlanners.environments.pacman_pomdp.pacman_pomdp.PacManPOMDP(maze_size=(7, 7), walls=None, initial_pellets=None, initial_pacman_pos=(0, 0), num_ghosts=1, initial_ghost_positions=None, initial_ghost_pos=None, pellet_reward=10.0, ghost_collision_penalty=-100.0, step_penalty=-1.0, win_reward=100.0, ghost_aggressiveness=2.0, ghost_coordination='independent', ghost_strategies=None, observation_noise_factor=0.3, max_observation_noise=1.5, discount_factor=0.95, name='PacManPOMDP', output_dir=None, debug=False)[source]

Bases: DiscreteActionsEnvironment

PacMan POMDP environment inspired by the classic arcade game.

This environment implements a simplified PacMan game where PacMan must collect pellets while avoiding one or more ghosts. Ghost positions are only partially observable through noisy sensor readings.

Parameters:
maze_size

Grid dimensions as (rows, cols)

walls

Set of wall positions as (row, col) tuples

initial_pellets

List of initial pellet positions

pellet_reward

Reward for collecting a pellet

ghost_collision_penalty

Penalty for collision with ghost

step_penalty

Cost per action

win_reward

Reward for collecting all pellets

ghost_aggressiveness

Temperature parameter for ghost movement policy

observation_noise_factor

Multiplier for observation noise based on distance

max_observation_noise

Maximum noise standard deviation

Example

>>> import numpy as np
>>> np.random.seed(42)  # For reproducible results
>>>
>>> # Initialize environment
>>> env = PacManPOMDP(maze_size=(7, 7))
>>>
>>> # Get initial state and actions
>>> initial_state = env.initial_state_dist().sample()[0]
>>> actions = env.get_actions()
>>>
>>> # Sample complete step using convenience method
>>> action = actions[0]
>>> next_state, observation, reward = env.sample_next_step(initial_state, action)
>>>
>>> # Check terminal condition
>>> env.is_terminal(initial_state)
False
array_to_observation(arr)[source]

Convert a flat numpy array back to a PacMan observation tuple.

Parameters:

arr (ndarray) – 1-D array of shape (2 * num_ghosts,).

Return type:

Tuple[Tuple[int, int], ...]

Returns:

Observation as tuple of (row, col) tuples.

array_to_state(arr)[source]

Convert a numpy array back to a PacManState.

Parameters:

arr (ndarray) – 1-D array of shape (self._state_dim,) produced by state_to_array().

Return type:

PacManState

Returns:

Reconstructed PacManState.

cache_visualization(history, cache_path)[source]

Cache visualization of episode history.

Parameters:
  • history (List[StepData]) – List of StepData objects representing the episode

  • cache_path (Path) – Path where the GIF should be saved

Return type:

None

compute_metrics(histories)[source]

Compute environment-specific metrics.

Return type:

List[MetricValue]

Parameters:

histories (List[History])

get_actions()[source]

Get all available actions.

Return type:

List[int]

get_metric_names()[source]

Get names of PacMan POMDP specific metrics.

Return type:

List[str]

Returns:

List containing metric names including standard metrics (win_rate, avg_pellets_collected, avg_episode_length, avg_pacman_closest_ghost_distance, avg_collision_encounters) and dynamically generated per-ghost distance metrics for multi-ghost scenarios (avg_pacman_ghost_0_distance, avg_pacman_ghost_1_distance, etc.)

property initial_ghost_pos: Tuple[int, int]

Backward compatibility: returns the first ghost position.

initial_observation_dist()[source]

Get initial observation distribution.

Return type:

DiscreteDistribution

initial_pellets: List[Tuple[int, int]]
initial_state_dist()[source]

Get initial state distribution.

Return type:

DiscreteDistribution

is_equal_observation(observation1, observation2)[source]

Check if two observations are equal.

Return type:

bool

Parameters:
  • observation1 (Any)

  • observation2 (Any)

is_terminal(state)[source]

Check if state is terminal.

Return type:

bool

Parameters:

state (Any)

observation_model(next_state, action)[source]

Get observation model.

Return type:

PacManObservationModel

Parameters:
  • next_state (Any)

  • action (int)

observation_to_array(obs)[source]

Convert a PacMan observation tuple to a flat numpy array.

Parameters:

obs (Tuple[Tuple[int, int], ...]) – Observation as tuple of ghost (row, col) positions.

Return type:

ndarray

Returns:

1-D array of shape (2 * num_ghosts,).

reward(state, action)[source]

Calculate immediate reward.

Return type:

float

Parameters:
  • state (Any)

  • action (int)

reward_batch(states, action)[source]

Calculate rewards for a batch of states.

Accepts either a 2-D numpy array of shape (N, state_dim) (vectorized path) or a sequence of PacManState objects (falls back to the loop-based default).

Computes deterministic reward components only: step penalty, pellet collection, and win bonus. Ghost collision penalty is excluded because it depends on stochastic ghost movement.

Parameters:
  • states (Union[ndarray, Sequence[Any]]) – Array of shape (N, state_dim) or sequence of states.

  • action (int) – Discrete action index (0-3).

Return type:

ndarray

Returns:

1-D array of reward values with shape (N,).

state_to_array(state)[source]

Convert a PacManState to a fixed-size numpy array.

The array layout is: [pac_row, pac_col, g0_row, g0_col, ..., pellet_mask[0..P-1], score, terminal]

Parameters:

state (PacManState) – A PacManState instance.

Return type:

ndarray

Returns:

1-D float array of shape (self._state_dim,).

state_transition_model(state, action)[source]

Get state transition model.

Return type:

PacManStateTransitionModel

Parameters:
  • state (Any)

  • action (int)

states_to_array(states)[source]

Batch-convert a list of PacManState to a 2-D numpy array.

Parameters:

states (List[PacManState]) – List of PacManState instances.

Return type:

ndarray

Returns:

Array of shape (len(states), self._state_dim).

visualize_path(path, actions, cache_path)[source]

Visualize PacMan path through the maze using sprite-based rendering.

Parameters:
  • path (List[PacManState]) – List of states representing the path through the maze

  • actions (List[int]) – List of actions taken at each step

  • cache_path (Path) – Path where the GIF should be saved

class POMDPPlanners.environments.pacman_pomdp.pacman_pomdp.PacManPOMDPMetrics(*values)[source]

Bases: Enum

Metric names for PacMan POMDP environment.

AVG_COLLISION_ENCOUNTERS = 'avg_collision_encounters'
AVG_EPISODE_LENGTH = 'avg_episode_length'
AVG_PACMAN_CLOSEST_GHOST_DISTANCE = 'avg_pacman_closest_ghost_distance'
AVG_PELLETS_COLLECTED = 'avg_pellets_collected'
WIN_RATE = 'win_rate'
class POMDPPlanners.environments.pacman_pomdp.pacman_pomdp.PacManState(pacman_pos, ghost_positions, pellets, score=0, terminal=False)[source]

Bases: object

State representation for PacMan POMDP.

Parameters:
pacman_pos

PacMan position as (row, col) tuple

ghost_positions

Tuple of ghost positions as (row, col) tuples

pellets

Tuple of remaining pellet positions as (row, col) tuples

score

Current game score

terminal

Whether the game has ended

property ghost_pos: Tuple[int, int]

Backward compatibility: returns the first ghost position.

ghost_positions: Tuple[Tuple[int, int], ...]
property num_ghosts: int

Number of ghosts in the game.

pacman_pos: Tuple[int, int]
pellets: Tuple[Tuple[int, int], ...]
score: int | float = 0
terminal: bool = False
class POMDPPlanners.environments.pacman_pomdp.pacman_pomdp.PacManStateTransitionModel(state, action, pomdp)[source]

Bases: StateTransitionModel

State transition model for PacMan POMDP.

Parameters:
  • state (Any)

  • action (int)

  • pomdp (PacManPOMDP)

probability(values)[source]

Calculate transition probabilities to next states.

Parameters:

values (List[PacManState]) – List of potential next states

Return type:

ndarray

Returns:

Array of probabilities for each state in values

sample(n_samples=1)[source]

Sample next states.

Return type:

List[PacManState]

Parameters:

n_samples (int)

POMDPPlanners.environments.pacman_pomdp.pacman_pomdp.create_simple_maze_pacman(maze_size=7, num_walls=5, num_ghosts=1, seed=None)[source]

Create a simple PacMan instance with random walls and multiple ghosts.

Parameters:
  • maze_size (int) – Size of square maze. Defaults to 7.

  • num_walls (int) – Number of walls to place randomly. Defaults to 5.

  • num_ghosts (int) – Number of ghosts in the game. Defaults to 1.

  • seed (Optional[int]) – Random seed. Defaults to None.

Return type:

PacManPOMDP

Returns:

Randomly configured PacMan POMDP with multi-ghost support

POMDPPlanners.environments.pacman_pomdp.pacman_visualizer module

Visualization module for PacMan POMDP environment.

This module provides sprite-based visualization capabilities for PacMan POMDP episodes, rendering animated GIFs of agent behavior and game state.

Classes:

PacManVisualizer: Handles sprite-based rendering and GIF generation

class POMDPPlanners.environments.pacman_pomdp.pacman_visualizer.PacManVisualizer(environment, tile_size=32)[source]

Bases: object

Handles visualization for PacMan POMDP environments.

This class manages sprite loading, frame rendering, and GIF generation for visualizing PacMan POMDP episodes. It renders the maze, PacMan, ghosts, pellets, and game state information.

Parameters:
env

Reference to the PacMan POMDP environment

tile_size

Size of each tile in pixels

sprites

Dictionary of loaded sprite images

cache_visualization(history, cache_path)[source]

Cache visualization of episode history.

Parameters:
  • history (List[StepData]) – List of StepData objects representing the episode

  • cache_path (Path) – Path where the GIF should be saved

Raises:
  • TypeError – If history or cache_path have wrong types

  • ValueError – If history is empty or cache_path doesn’t end with .gif

Return type:

None

visualize_path(path, actions, cache_path)[source]

Visualize PacMan path through the maze using sprite-based rendering.

Parameters:
  • path (List[PacManState]) – List of states representing the path through the maze

  • actions (List[int]) – List of actions taken at each step

  • cache_path (Path) – Path where the GIF should be saved

Raises:

TypeError – If cache_path is not a Path object

Return type:

None