POMDPPlanners.planners.sparse_sampling_planners package
Submodules
POMDPPlanners.planners.sparse_sampling_planners.icvar_sparse_sampling module
ICVaR Sparse Sampling POMDP Planning Algorithm Implementation.
This module implements a risk-sensitive variant of the sparse sampling algorithm for POMDP planning. Instead of using the expected value (mean) for Bellman backups, it uses the Conditional Value at Risk (CVaR) to focus on the worst-alpha fraction of outcomes.
- Reference:
Pariente, Y., & Indelman, V. (2026). Online Risk-Averse Planning in POMDPs Using Iterated CVaR Value Function. arXiv preprint arXiv:2601.20554. https://arxiv.org/abs/2601.20554
- Classes:
ICVaRSparseSampling: Risk-sensitive sparse sampling with CVaR-based value updates
- class POMDPPlanners.planners.sparse_sampling_planners.icvar_sparse_sampling.ICVaRSparseSampling(environment, branching_factor, depth, alpha, name='ICVaRSparseSampling')[source]
Bases:
SparseSamplingDiscreteActionsPlanner

Risk-sensitive sparse sampling planner using CVaR for value backups.
This planner extends the standard sparse sampling algorithm by replacing the expected value (mean) in Q-value computation with the Conditional Value at Risk (CVaR). CVaR focuses on the worst-alpha fraction of outcomes, making the planner risk-sensitive.
- The standard Q-value update uses:
Q = immediate_cost + gamma * mean(child_v_values)
- The ICVaR variant replaces this with:
Q = immediate_cost + gamma * CVaR_alpha(child_v_values)
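The CVaR backup above can be sketched as follows. In this cost formulation, CVaR_alpha averages the worst (highest-cost) alpha fraction of the sampled child values; with alpha=1.0 it reduces to the plain mean, recovering the standard backup. The helper names here are illustrative, not part of the package API.

```python
import numpy as np

def cvar(values, alpha):
    """CVaR_alpha of sampled costs: mean of the worst (highest) alpha fraction.

    With alpha=1.0 this is the plain mean, i.e. the standard
    sparse sampling backup.
    """
    values = np.sort(np.asarray(values, dtype=float))[::-1]  # worst costs first
    k = max(1, int(np.ceil(alpha * len(values))))            # number of tail samples
    return values[:k].mean()

def icvar_q_value(immediate_cost, child_v_values, gamma, alpha):
    # Q = immediate_cost + gamma * CVaR_alpha(child_v_values)
    return immediate_cost + gamma * cvar(child_v_values, alpha)
```

Lowering alpha shrinks the averaged tail, so the backup weights the worst sampled outcomes more heavily and the planner becomes more risk-averse.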
- Parameters:
environment (DiscreteActionsEnvironment)
branching_factor (int)
depth (int)
alpha (float)
name (str)
- alpha
CVaR confidence level (0 < alpha <= 1). Lower alpha means more risk-sensitive (focuses on worse outcomes). alpha=1.0 recovers the standard expected value.
Example
>>> import numpy as np
>>> from POMDPPlanners.environments.tiger_pomdp import TigerPOMDP
>>> from POMDPPlanners.core.belief import get_initial_belief
>>> np.random.seed(42)  # For reproducible results
>>>
>>> # Create environment and risk-sensitive planner
>>> tiger = TigerPOMDP(discount_factor=0.95)
>>> planner = ICVaRSparseSampling(
...     environment=tiger,
...     branching_factor=2,
...     depth=2,
...     alpha=0.3,
...     name="ICVaRPlanner"
... )
>>>
>>> # Basic planner interface usage
>>> planner.name
'ICVaRPlanner'
>>> planner.alpha
0.3
>>>
>>> # Action selection from belief
>>> initial_belief = get_initial_belief(tiger, n_particles=10)
>>> actions, run_data = planner.action(initial_belief)
>>>
>>> # Planner space information
>>> space_info = ICVaRSparseSampling.get_space_info()
>>> space_info.action_space.name
'DISCRETE'
- classmethod get_space_info()[source]
Get space type requirements for this policy class.
This class method specifies what types of action and observation spaces this policy implementation can handle, enabling compatibility checking with environments.
- Return type:
PolicySpaceInfo
- Returns:
PolicySpaceInfo specifying required action and observation space types
Note
Subclasses must implement this method to declare their space compatibility. This is used for validation when pairing policies with environments.
POMDPPlanners.planners.sparse_sampling_planners.sparse_sampling module
Sparse Sampling POMDP Planning Algorithm Implementation.
This module implements the sparse sampling algorithm for POMDP planning, which builds a finite-depth lookahead tree by sampling a limited number of outcomes at each node. The algorithm provides theoretical guarantees on the quality of the computed policy.
The sparse sampling approach works by:
1. Building a finite-depth tree from the current belief
2. Sampling a fixed number of next states and observations at each node
3. Computing value estimates using dynamic programming
4. Selecting the action with the best estimated value
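These steps amount to a depth-limited, sampled Bellman recursion. The sketch below illustrates the idea for a generic generative model; the `sample_transition` callback and the cost convention are illustrative assumptions, not the package API.

```python
def sparse_sampling_value(state, depth, actions, sample_transition,
                          branching_factor, gamma):
    """Estimate V(state) with a depth-limited, sampled Bellman backup.

    sample_transition(state, action) -> (cost, next_state) is assumed
    to draw one outcome from the generative model.
    """
    if depth == 0:
        return 0.0  # leaf: no lookahead beyond this depth
    q_values = []
    for a in actions:
        total = 0.0
        # Sample a fixed number of outcomes per action (the branching factor)
        for _ in range(branching_factor):
            cost, next_state = sample_transition(state, a)
            total += cost + gamma * sparse_sampling_value(
                next_state, depth - 1, actions, sample_transition,
                branching_factor, gamma)
        q_values.append(total / branching_factor)  # sampled mean backup
    return min(q_values)  # cost formulation: best action minimizes Q
```

The tree has at most (|actions| * branching_factor)^depth leaves, which is independent of the size of the state space, the key property behind the algorithm's guarantees.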
- Reference:
Kearns, M., Mansour, Y., & Ng, A. Y. (2002). A Sparse Sampling Algorithm for Near-Optimal Planning in Large Markov Decision Processes. Machine Learning, 49, 193-208. https://link.springer.com/article/10.1023/A:1017932429737
- Classes:
BaseSparseSamplingDiscreteActionsPlanner: Abstract base class for sparse sampling algorithms SparseSamplingDiscreteActionsPlanner: Concrete implementation with standard value updates
- class POMDPPlanners.planners.sparse_sampling_planners.sparse_sampling.BaseSparseSamplingDiscreteActionsPlanner(environment, branching_factor, depth, resampling=False, name='BaseSparseSamplingDiscreteActionsPlanner', log_path=None, debug=False)[source]
Abstract base class for sparse sampling POMDP planners.
This class implements the core sparse sampling algorithm for POMDP planning. It builds a finite-depth lookahead tree by sampling a limited number of outcomes at each node, providing theoretical guarantees on policy quality.
The algorithm works by building a tree where:
- Each belief node represents a belief state
- Each action node represents taking an action from a belief
- The tree depth is limited to control computational complexity
- Value estimates are computed using dynamic programming
- Parameters:
- environment
The POMDP environment to plan for
- branching_factor
Number of samples at each node (controls tree width)
- depth
Maximum planning depth (controls tree height)
- resampling
Whether to resample particles during belief updates
Note
This is an abstract base class and cannot be instantiated directly. Subclasses must implement the value update methods for leaf and non-leaf nodes.
- action(belief)[source]
Select action(s) based on the current belief state.
This is the core method that implements the policy’s decision-making logic. It takes a belief state and returns the selected action(s) along with execution information and performance metrics.
- Parameters:
belief (Belief) – Current belief state representing uncertainty over states
- Returns:
List of selected actions (typically single action, but supports multiple)
PolicyRunData with execution metrics and performance information
- Return type:
Note
Subclasses must implement this method with their specific planning or decision-making algorithm.
- classmethod get_info_variable_names()[source]
Get names of policy info variables.
The sparse sampling planner does not produce any info variables.
- classmethod get_space_info()[source]
Get space type requirements for this policy class.
This class method specifies what types of action and observation spaces this policy implementation can handle, enabling compatibility checking with environments.
- Return type:
PolicySpaceInfo
- Returns:
PolicySpaceInfo specifying required action and observation space types
Note
Subclasses must implement this method to declare their space compatibility. This is used for validation when pairing policies with environments.
- class POMDPPlanners.planners.sparse_sampling_planners.sparse_sampling.SparseSamplingDiscreteActionsPlanner(environment, branching_factor, depth, name='SparseSamplingDiscreteActionsPlanner')[source]
Bases:
BaseSparseSamplingDiscreteActionsPlanner

Standard implementation of sparse sampling for POMDP planning.
This concrete implementation of sparse sampling uses standard value updates:
- Q-values for actions are computed as immediate cost plus discounted future value
- V-values for beliefs are computed as the minimum Q-value over actions (cost formulation)
- Leaf nodes use only immediate cost estimates
The algorithm provides theoretical guarantees: with probability 1-δ, the computed policy is ε-optimal, where ε decreases with increasing depth and branching factor.
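A minimal illustration of the standard update rules (helper names are illustrative, not the package API): Q combines the immediate cost with the discounted mean of sampled child values, V takes the minimum Q over actions, and a leaf backs up only its immediate cost.

```python
import numpy as np

def q_value(immediate_cost, child_v_values, gamma):
    # Standard backup: immediate cost plus discounted expected future cost.
    return immediate_cost + gamma * np.mean(child_v_values)

def v_value(q_values):
    # Cost formulation: the belief's value is the best (minimum) Q over actions.
    return min(q_values)

def leaf_v_value(immediate_cost):
    # Leaf nodes use only the immediate cost estimate.
    return immediate_cost
```

Comparing with the ICVaR variant above, the only change there is replacing the mean in `q_value` with a CVaR over the sampled child values.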
Example
>>> import numpy as np
>>> from POMDPPlanners.environments.tiger_pomdp import TigerPOMDP
>>> from POMDPPlanners.core.belief import get_initial_belief
>>> np.random.seed(42)  # For reproducible results
>>>
>>> # Create environment and planner
>>> tiger = TigerPOMDP(discount_factor=0.95)
>>> planner = SparseSamplingDiscreteActionsPlanner(
...     environment=tiger,
...     branching_factor=2,
...     depth=2,
...     name="ExamplePlanner"
... )
>>>
>>> # Basic planner interface usage
>>> planner.name
'ExamplePlanner'
>>>
>>> # Action selection from belief
>>> initial_belief = get_initial_belief(tiger, n_particles=10)
>>> actions, run_data = planner.action(initial_belief)
>>>
>>> # Planner space information
>>> space_info = SparseSamplingDiscreteActionsPlanner.get_space_info()
>>> space_info.action_space.name
'DISCRETE'
- Parameters:
environment (DiscreteActionsEnvironment)
branching_factor (int)
depth (int)
name (str)