POMDPPlanners.configs package

class POMDPPlanners.configs.EnvironmentConfigsAPI(discount_factor=0.95, debug=False)[source]

Bases: object

Parameters:
  • discount_factor

  • debug

cartpole_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

continuous_laser_tag_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

continuous_laser_tag_pomdp_discrete_actions_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

continuous_observations_continuous_actions_light_dark_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

continuous_observations_discrete_actions_light_dark_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

get_compatible_environments(policy_space_info, n_particles=20, seed=42)[source]

Get list of environments compatible with the given policy space info.

Parameters:
  • policy_space_info (PolicySpaceInfo) – Policy space information containing action and observation space types

  • n_particles (int) – Number of particles for belief initialization

  • seed (int) – Random seed for reproducible belief initialization

Return type:

List[Tuple[Environment, Belief]]

Returns:

List of tuples containing (environment, belief) pairs that are compatible with the policy
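The matching step described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `SpaceType`, `SpaceInfo`, and `ToyEnvironment` are hypothetical stand-ins defined here, and exact equality of space types is an assumption (the real compatibility rule may be more permissive).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class SpaceType(Enum):  # stand-in enum, defined only for this sketch
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"


@dataclass(frozen=True)
class SpaceInfo:  # hypothetical stand-in for PolicySpaceInfo
    action_space: SpaceType
    observation_space: SpaceType


@dataclass(frozen=True)
class ToyEnvironment:  # hypothetical stand-in for Environment
    name: str
    space_info: SpaceInfo


def filter_compatible(
    candidates: List[Tuple[ToyEnvironment, list]],
    policy_space_info: SpaceInfo,
) -> List[Tuple[ToyEnvironment, list]]:
    """Keep (environment, belief) pairs whose space types equal the policy's."""
    return [
        (env, belief)
        for env, belief in candidates
        if env.space_info == policy_space_info
    ]


discrete = SpaceInfo(SpaceType.DISCRETE, SpaceType.DISCRETE)
mixed = SpaceInfo(SpaceType.DISCRETE, SpaceType.CONTINUOUS)
candidates = [
    (ToyEnvironment("toy_discrete_env", discrete), ["s0", "s1"]),
    (ToyEnvironment("toy_continuous_obs_env", mixed), [0.0, 1.0]),
]
compatible = filter_compatible(candidates, discrete)
compatible_names = [env.name for env, _ in compatible]
```

Only the environment whose action and observation space types both match the requested policy space survives the filter.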

laser_tag_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

mountain_car_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

pacman_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

push_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

rock_sample_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, WeightedParticleBelief]

Parameters:

n_particles (int)

safety_ant_velocity_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

tiger_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

class POMDPPlanners.configs.PlannersHyperparamConfigs(discount_factor)[source]

Bases: object

Parameters:

discount_factor (float)

discrete_action_sequences_config(env, name)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name

get_compatible_planners(env, time_out_in_seconds=3.0)[source]

Get all planners that are compatible with the given environment.

This method inspects the environment’s space information and returns a list of configured planners that can solve the environment. Compatibility is determined by checking whether each planner’s space requirements match the environment’s space types, following the logic of Policy._verify_environment_compatibility.

Parameters:
  • env (Environment) – The POMDP environment to find compatible planners for

  • time_out_in_seconds (float) – Time limit for each planner. Defaults to 3.0.

Return type:

List[HyperParamPlannerConfig]

Returns:

List of HyperParamPlannerConfig objects for compatible planners

Note

  • For discrete action spaces, uses DiscreteActionSampler

  • For continuous action spaces, uses UnitCircleActionSampler

  • Only includes planners that can handle the environment’s space types
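The sampler choice in the note can be illustrated with a self-contained sketch. The functions below are hypothetical stand-ins, not the library's DiscreteActionSampler or UnitCircleActionSampler; in particular, drawing a point on the boundary of the unit circle is an assumption about what a unit-circle sampler does.

```python
import math
import random
from typing import Callable, Sequence, Tuple, Union

Action = Union[int, Tuple[float, float]]


def make_discrete_sampler(
    actions: Sequence[int], rng: random.Random
) -> Callable[[], Action]:
    """Return a sampler that picks uniformly from a finite action set."""
    def sample() -> Action:
        return rng.choice(list(actions))
    return sample


def make_unit_circle_sampler(rng: random.Random) -> Callable[[], Action]:
    """Return a sampler that draws a 2D direction on the unit circle."""
    def sample() -> Action:
        angle = rng.uniform(0.0, 2.0 * math.pi)
        return (math.cos(angle), math.sin(angle))
    return sample


def choose_sampler(
    action_space_is_discrete: bool,
    actions: Sequence[int],
    rng: random.Random,
) -> Callable[[], Action]:
    """Pick the sampler kind from the action space type."""
    if action_space_is_discrete:
        return make_discrete_sampler(actions, rng)
    return make_unit_circle_sampler(rng)


rng = random.Random(0)
discrete_sample = choose_sampler(True, [0, 1, 2], rng)()
continuous_sample = choose_sampler(False, [], rng)()
```

Passing the random generator in explicitly keeps the sampling reproducible, which matters when planners are compared across runs.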

pft_dpw_config(env, action_sampler, name, time_out_in_seconds=3.0)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • action_sampler

  • name

  • time_out_in_seconds

pomcp_config(env, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name

  • time_out_in_seconds

pomcp_dpw_config(env, action_sampler, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • action_sampler

  • name

  • time_out_in_seconds

pomcpow_config(env, action_sampler, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • action_sampler

  • name

  • time_out_in_seconds

sparse_pft_config(env, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name

  • time_out_in_seconds

sparse_sampling_config(env, name)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name

Submodules

POMDPPlanners.configs.environment_configs module

class POMDPPlanners.configs.environment_configs.EnvironmentConfigsAPI(discount_factor=0.95, debug=False)[source]

Bases: object

Parameters:
  • discount_factor

  • debug

cartpole_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

continuous_laser_tag_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

continuous_laser_tag_pomdp_discrete_actions_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

continuous_observations_continuous_actions_light_dark_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

continuous_observations_discrete_actions_light_dark_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

get_compatible_environments(policy_space_info, n_particles=20, seed=42)[source]

Get list of environments compatible with the given policy space info.

Parameters:
  • policy_space_info (PolicySpaceInfo) – Policy space information containing action and observation space types

  • n_particles (int) – Number of particles for belief initialization

  • seed (int) – Random seed for reproducible belief initialization

Return type:

List[Tuple[Environment, Belief]]

Returns:

List of tuples containing (environment, belief) pairs that are compatible with the policy

laser_tag_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

mountain_car_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

pacman_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

push_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

rock_sample_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, WeightedParticleBelief]

Parameters:

n_particles (int)

safety_ant_velocity_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

tiger_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

class POMDPPlanners.configs.environment_configs.RiskAverseEnvironmentConfigsAPI(discount_factor=0.95, debug=False)[source]

Bases: object

Parameters:
  • discount_factor

  • debug

continuous_laser_tag_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

continuous_laser_tag_pomdp_discrete_actions_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

continuous_observations_continuous_actions_light_dark_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

continuous_observations_discrete_actions_light_dark_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, Belief]

Parameters:

n_particles (int)

get_compatible_environments(policy_space_info, n_particles=20, seed=42)[source]

Get list of environments compatible with the given policy space info.

Parameters:
  • policy_space_info (PolicySpaceInfo) – Policy space information containing action and observation space types

  • n_particles (int) – Number of particles for belief initialization

  • seed (int) – Random seed for reproducible belief initialization

Return type:

List[Tuple[Environment, Belief]]

Returns:

List of tuples containing (environment, belief) pairs that are compatible with the policy

laser_tag_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

pacman_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

push_pomdp_config(n_particles=20)[source]
Return type:

Tuple[DiscreteActionsEnvironment, WeightedParticleBelief]

Parameters:

n_particles (int)

rock_sample_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, WeightedParticleBelief]

Parameters:

n_particles (int)

safety_ant_velocity_pomdp_config(n_particles=20)[source]
Return type:

Tuple[Environment, Belief]

Parameters:

n_particles (int)

POMDPPlanners.configs.environment_configs.get_all_environments(n_particles=20, include_risk_averse=True)[source]

Get all environments from both standard and risk-averse API classes.

Parameters:
  • n_particles (int) – Number of particles for belief initialization

  • include_risk_averse (bool) – Whether to include environments from RiskAverseEnvironmentConfigsAPI

Return type:

List[Tuple[Environment, Belief]]

Returns:

List of tuples containing (environment, belief) pairs from all available configurations
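One plausible way to gather every configuration from both API classes is to call each method whose name ends in `_pomdp_config`. The sketch below uses toy classes defined here; collecting methods by name suffix is an assumption and may not be how the library does it.

```python
from typing import List, Tuple


class ToyStandardAPI:  # hypothetical stand-in for EnvironmentConfigsAPI
    def tiger_pomdp_config(self, n_particles: int = 20) -> Tuple[str, list]:
        return ("tiger", [0] * n_particles)

    def push_pomdp_config(self, n_particles: int = 20) -> Tuple[str, list]:
        return ("push", [0] * n_particles)


class ToyRiskAverseAPI:  # hypothetical stand-in for RiskAverseEnvironmentConfigsAPI
    def push_pomdp_config(self, n_particles: int = 20) -> Tuple[str, list]:
        return ("risk_averse_push", [0] * n_particles)


def collect_configs(api: object, n_particles: int) -> List[Tuple[str, list]]:
    """Call every `*_pomdp_config` method on `api`, in sorted name order."""
    pairs = []
    for name in sorted(dir(api)):
        if name.endswith("_pomdp_config"):
            pairs.append(getattr(api, name)(n_particles=n_particles))
    return pairs


def get_all(
    n_particles: int = 20, include_risk_averse: bool = True
) -> List[Tuple[str, list]]:
    """Combine the standard pairs with the optional risk-averse pairs."""
    pairs = collect_configs(ToyStandardAPI(), n_particles)
    if include_risk_averse:
        pairs += collect_configs(ToyRiskAverseAPI(), n_particles)
    return pairs


all_pairs = get_all(n_particles=3)
standard_only = get_all(n_particles=3, include_risk_averse=False)
```

With `include_risk_averse=False` the result contains only the pairs from the standard class.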

POMDPPlanners.configs.environment_configs.get_compatible_environments(config_api_instance, policy_space_info, n_particles=20, seed=42)[source]

Get list of environments compatible with the given policy space info.

Parameters:
  • config_api_instance – Instance of EnvironmentConfigsAPI or its subclass

  • policy_space_info (PolicySpaceInfo) – Policy space information containing action and observation space types

  • n_particles (int) – Number of particles for belief initialization

  • seed (int) – Random seed for reproducible belief initialization

Return type:

List[Tuple[Environment, Belief]]

Returns:

List of tuples containing (environment, belief) pairs that are compatible with the policy
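The roles of `n_particles` and `seed` can be illustrated with a toy particle belief. This is a sketch under stated assumptions: `init_particle_belief` is a hypothetical helper, and sampling states uniformly with equal weights is only one possible initialization.

```python
import random
from typing import List, Sequence, Tuple


def init_particle_belief(
    states: Sequence[str], n_particles: int = 20, seed: int = 42
) -> Tuple[List[str], List[float]]:
    """Draw `n_particles` states with a seeded generator and weight them equally."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    particles = [rng.choice(list(states)) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    return particles, weights


belief_a = init_particle_belief(["tiger_left", "tiger_right"], n_particles=5, seed=42)
belief_b = init_particle_belief(["tiger_left", "tiger_right"], n_particles=5, seed=42)
```

Two calls with the same seed produce the same particle set, which is what makes the initial belief reproducible across runs.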

POMDPPlanners.configs.experiment_configs module

class POMDPPlanners.configs.experiment_configs.AllBenchmarkEnvironmentsOnPlannerGeneratorsExperimentConfigCreator(generators, n_particles, num_episodes, num_steps, is_risk_averse)[source]

Bases: EvaluationExperimentConfigCreator

Parameters:
  • generators

  • n_particles

  • num_episodes

  • num_steps

  • is_risk_averse

class POMDPPlanners.configs.experiment_configs.AllHyperparameterBenchmarksExperimentConfigCreator(policy_space_info, particles, num_episodes, num_steps, n_trials, discount_factor, time_out_in_seconds, is_risk_averse, debug=False)[source]

Bases: HyperparameterOptimizationExperimentConfigCreator

Experiment configuration creator for all hyperparameter benchmarks.

This class creates hyperparameter optimization experiment configurations for all compatible environments and planners based on a given policy space. It automatically finds all environments that match the specified action and observation space types, and generates configurations for hyperparameter tuning experiments.

Parameters:
policy_space_info

Policy space information specifying action and observation space types for compatibility matching.

particles

Number of particles for belief representation.

num_episodes

Number of episodes for optimization.

num_steps

Maximum steps per episode for optimization.

n_trials

Number of optimization trials.

discount_factor

Discount factor for the POMDP.

time_out_in_seconds

Timeout for planner execution.

is_risk_averse

Whether to use risk-averse optimization metrics.

parameter_to_optimize_mapper

Mapper for determining optimization parameters based on environment and risk-averse settings.
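As a reminder of how `discount_factor` and `num_steps` interact, the sketch below computes the discounted return of a short episode. `discounted_return` is a hypothetical helper written for this illustration, not part of the package.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], discount_factor: float = 0.95) -> float:
    """Sum rewards, weighting the reward at step t by discount_factor ** t."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount_factor ** step) * reward
    return total


# Three steps with reward 1.0 each: 1 + 0.95 + 0.95**2 = 2.8525
value = discounted_return([1.0, 1.0, 1.0], discount_factor=0.95)
```

A smaller discount factor shrinks the contribution of later steps, so it changes how much a longer `num_steps` horizon affects the optimized metric.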

Example

>>> from POMDPPlanners.core.policy import PolicySpaceInfo
>>> from POMDPPlanners.core.environment import SpaceType
>>> from POMDPPlanners.configs.experiment_configs import (
...     AllHyperparameterBenchmarksExperimentConfigCreator
... )
>>>
>>> # Create policy space info for discrete environments
>>> space_info = PolicySpaceInfo(
...     action_space=SpaceType.DISCRETE,
...     observation_space=SpaceType.DISCRETE
... )
>>>
>>> # Create experiment config creator
>>> creator = AllHyperparameterBenchmarksExperimentConfigCreator(
...     policy_space_info=space_info,
...     particles=10,
...     num_episodes=2,
...     num_steps=3,
...     n_trials=5,
...     discount_factor=0.95,
...     time_out_in_seconds=3.0,
...     is_risk_averse=False
... )
>>>
>>> # Get experiment configurations
>>> configs = creator.get_experiment_configs()
>>>
>>> # Verify configurations
>>> len(configs) > 0
True
>>> all(config.num_episodes == 2 for config in configs)
True
>>> all(config.num_steps == 3 for config in configs)
True
>>> all(config.n_trials == 5 for config in configs)
True
class POMDPPlanners.configs.experiment_configs.AverageReturnParameterToOptimizeMapper[source]

Bases: ParameterToOptimizeMapper

generate(environment, policy_cls=None)[source]
Return type:

List[Tuple[str, HyperParameterOptimizationDirection]]

Parameters:
  • environment

  • policy_cls

class POMDPPlanners.configs.experiment_configs.PolicyHyperparameterOptimizationExperimentConfigCreator(generators, particles, num_episodes, num_steps, n_trials, discount_factor, time_out_in_seconds, is_risk_averse, debug=False)[source]

Bases: HyperparameterOptimizationExperimentConfigCreator

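Parameters:
  • generators

  • particles

  • num_episodes

  • num_steps

  • n_trials

  • discount_factor

  • time_out_in_seconds

  • is_risk_averse

  • debug
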
class POMDPPlanners.configs.experiment_configs.RiskAverseParameterToOptimizeMapper[source]

Bases: ParameterToOptimizeMapper

generate(environment, policy_cls=None)[source]
Return type:

List[Tuple[str, HyperParameterOptimizationDirection]]

Parameters:
  • environment

  • policy_cls

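The return shape of `generate`, a list of (metric name, direction) pairs, can be sketched as follows. Everything here is a hypothetical stand-in: `Direction` replaces HyperParameterOptimizationDirection (whose members are not listed on this page), and the metric names `"average_return"` and `"tail_cost"` are invented for illustration.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Direction(Enum):  # stand-in enum, defined only for this sketch
    MAXIMIZE = "maximize"
    MINIMIZE = "minimize"


class ToyAverageReturnMapper:
    """Toy mapper: optimize a single return metric upward."""

    def generate(
        self, environment: object, policy_cls: Optional[type] = None
    ) -> List[Tuple[str, Direction]]:
        return [("average_return", Direction.MAXIMIZE)]


class ToyRiskAverseMapper:
    """Toy mapper: add a hypothetical cost metric to be minimized."""

    def generate(
        self, environment: object, policy_cls: Optional[type] = None
    ) -> List[Tuple[str, Direction]]:
        return [
            ("average_return", Direction.MAXIMIZE),
            ("tail_cost", Direction.MINIMIZE),
        ]


standard_targets = ToyAverageReturnMapper().generate(environment=None)
risk_averse_targets = ToyRiskAverseMapper().generate(environment=None)
```

Returning a list lets a mapper request several objectives at once, each with its own optimization direction.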
POMDPPlanners.configs.experiment_configs.complete_environments_and_benchmarks_hyperparameter_optimization_configs(generators, parameter_to_optimize_mapper, particles=30, num_episodes=10, num_steps=20, n_trials=500, discount_factor=0.95, time_out_in_seconds=3.0, is_risk_averse=False)[source]
Return type:

List[HyperParameterRunParams]

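Parameters:
  • generators

  • parameter_to_optimize_mapper

  • particles

  • num_episodes

  • num_steps

  • n_trials

  • discount_factor

  • time_out_in_seconds

  • is_risk_averse
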
POMDPPlanners.configs.experiment_configs.get_benchmarks_hyperparameter_optimization_configs(conf, discount_factor, time_out_in_seconds=3.0)[source]
Return type:

List[HyperParameterRunParams]

Parameters:
  • conf

  • discount_factor

  • time_out_in_seconds

POMDPPlanners.configs.experiment_configs.get_hyperparameter_benchmarks(policy_space_info, particle_count=30, time_out_in_seconds=3.0)[source]
Return type:

List[Tuple[Environment, Belief, List[HyperParamPlannerConfig]]]

Parameters:
  • policy_space_info

  • particle_count

  • time_out_in_seconds

POMDPPlanners.configs.planners_hyperparam_configs module

class POMDPPlanners.configs.planners_hyperparam_configs.PlannersHyperparamConfigs(discount_factor)[source]

Bases: object

Parameters:

discount_factor (float)

discrete_action_sequences_config(env, name)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name

get_compatible_planners(env, time_out_in_seconds=3.0)[source]

Get all planners that are compatible with the given environment.

This method inspects the environment’s space information and returns a list of configured planners that can solve the environment. Compatibility is determined by checking whether each planner’s space requirements match the environment’s space types, following the logic of Policy._verify_environment_compatibility.

Parameters:
  • env (Environment) – The POMDP environment to find compatible planners for

  • time_out_in_seconds (float) – Time limit for each planner. Defaults to 3.0.

Return type:

List[HyperParamPlannerConfig]

Returns:

List of HyperParamPlannerConfig objects for compatible planners

Note

  • For discrete action spaces, uses DiscreteActionSampler

  • For continuous action spaces, uses UnitCircleActionSampler

  • Only includes planners that can handle the environment’s space types

pft_dpw_config(env, action_sampler, name, time_out_in_seconds=3.0)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • action_sampler

  • name

  • time_out_in_seconds

pomcp_config(env, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name

  • time_out_in_seconds

pomcp_dpw_config(env, action_sampler, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • action_sampler

  • name

  • time_out_in_seconds

pomcpow_config(env, action_sampler, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • action_sampler

  • name

  • time_out_in_seconds

sparse_pft_config(env, name, time_out_in_seconds=3)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name

  • time_out_in_seconds

sparse_sampling_config(env, name)[source]
Return type:

HyperParamPlannerConfig

Parameters:
  • env

  • name