POMDPPlanners.core.belief package

class POMDPPlanners.core.belief.Belief[source]

Bases: ABC

Abstract base class for POMDP belief state representations.

This class defines the interface for belief states in POMDP environments. Belief states represent probability distributions over the state space, capturing the agent’s uncertainty about the current state.

Note

This is an abstract base class and cannot be instantiated directly. Subclasses must implement the update() and sample() methods.

property config_id: str

Generate a deterministic identifier based on belief configuration.

classmethod from_config(config)[source]

Create a belief instance from configuration.

Factory method that dynamically creates belief instances based on configuration objects specifying the class name and parameters.

Parameters:

config – Configuration object with class_name and params attributes

Returns:

New belief instance of the specified type

Raises:

ValueError – If the specified belief class is not found
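
The registry-based factory pattern that from_config describes can be sketched standalone; the Config dataclass, registry dict, and PointBelief class below are illustrative assumptions, not the package's actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Config:
    class_name: str
    params: dict = field(default_factory=dict)

# Hypothetical registry mapping class names to belief classes.
_BELIEF_REGISTRY = {}

def register(cls):
    _BELIEF_REGISTRY[cls.__name__] = cls
    return cls

@register
class PointBelief:
    """Toy belief concentrated on a single state."""
    def __init__(self, state):
        self.state = state

def from_config(config):
    # Look up the class by name; unknown names raise ValueError, as documented.
    try:
        cls = _BELIEF_REGISTRY[config.class_name]
    except KeyError:
        raise ValueError(f"Belief class not found: {config.class_name}")
    return cls(**config.params)

belief = from_config(Config("PointBelief", {"state": 0}))
print(type(belief).__name__)  # PointBelief
```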

inplace_update(action, observation, pomdp, state=None)[source]

Update the belief in place given an action-observation pair.

Return type:

None

abstractmethod sample()[source]

Sample a state from the current belief distribution.

Return type:

Any

Returns:

A state sampled according to the belief’s probability distribution

Note

Subclasses must implement this method to enable state sampling for planning and simulation purposes.

abstractmethod update(action, observation, pomdp, state=None)[source]

Update belief given an action-observation pair.

Performs Bayesian belief update using the environment’s transition and observation models.

Parameters:
  • action (Any) – Action that was executed

  • observation (Any) – Observation that was received

  • pomdp (Environment) – Environment providing transition and observation models

  • state (Any | None)

Return type:

Belief

Returns:

Updated belief state reflecting the new information

Note

Subclasses must implement this method according to their specific belief representation and update strategy.
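
As an illustration of what a concrete subclass provides, here is a minimal exact discrete-Bayes belief written standalone against Python's abc module (the Belief stand-in, the static-state assumption, and the 85%-accurate observation model are all illustrative, not the package's code):

```python
import random
from abc import ABC, abstractmethod

class Belief(ABC):  # stand-in for POMDPPlanners.core.belief.Belief
    @abstractmethod
    def update(self, action, observation, pomdp, state=None): ...
    @abstractmethod
    def sample(self): ...

class DiscreteBelief(Belief):
    """Exact Bayes filter over a finite state set (static states for simplicity)."""
    def __init__(self, probs):
        self.probs = dict(probs)  # state -> probability

    def update(self, action, observation, pomdp, state=None):
        # Bayesian update: b'(s) proportional to P(o | s, a) * b(s)
        new = {s: pomdp.obs_prob(observation, s, action) * p
               for s, p in self.probs.items()}
        z = sum(new.values())
        return DiscreteBelief({s: p / z for s, p in new.items()})

    def sample(self):
        states, weights = zip(*self.probs.items())
        return random.choices(states, weights=weights)[0]

class TigerLikeModel:
    """Illustrative observation model: listening is 85% accurate."""
    def obs_prob(self, obs, state, action):
        correct = (obs == "hear_left") == (state == "tiger_left")
        return 0.85 if correct else 0.15

belief = DiscreteBelief({"tiger_left": 0.5, "tiger_right": 0.5})
belief = belief.update("listen", "hear_left", TigerLikeModel())
print(round(belief.probs["tiger_left"], 2))  # 0.85
```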

class POMDPPlanners.core.belief.ExtendedKalmanFilterUpdater(transition_fn, observation_fn, transition_jacobian, observation_jacobian, Q, R)[source]

Bases: GaussianBeliefUpdater

Extended Kalman Filter updater for nonlinear systems.

The system model is:

x_{t+1} = f(x_t, u_t) + w,  w ~ N(0, Q)
z_t = h(x_{t+1}) + v,  v ~ N(0, R)

The EKF linearizes around the current estimate using the provided Jacobians.

Parameters:
transition_fn

State transition function f(state, action) -> next_state.

observation_fn

Observation function h(state) -> observation.

transition_jacobian

Jacobian of f w.r.t. state, F(state, action) -> (d, d).

observation_jacobian

Jacobian of h w.r.t. state, H(state) -> (p, d).

Q

Process noise covariance of shape (d, d).

R

Observation noise covariance of shape (p, p).

Example

>>> import numpy as np
>>> f = lambda x, u: x
>>> h = lambda x: x
>>> F = lambda x, u: np.eye(len(x))
>>> H = lambda x: np.eye(len(x))
>>> Q = 0.1 * np.eye(2)
>>> R = 0.5 * np.eye(2)
>>> updater = ExtendedKalmanFilterUpdater(
...     transition_fn=f, observation_fn=h,
...     transition_jacobian=F, observation_jacobian=H, Q=Q, R=R,
... )
>>> mean = np.zeros(2)
>>> cov = np.eye(2)
>>> new_mean, new_cov = updater.update(mean, cov, np.zeros(1), np.array([1.0, 0.0]))
>>> new_mean.shape
(2,)
property config_id: str

Return a deterministic identifier for this updater configuration.

update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean – Prior mean vector of shape (d,).

  • covariance – Prior covariance matrix of shape (d, d).

  • action – Action that was executed.

  • observation – Observation that was received.

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.

class POMDPPlanners.core.belief.GaussianBelief(mean, covariance, updater, n_terminal_check_samples=50)[source]

Bases: Belief

Multivariate Gaussian belief state representation.

Represents the belief as a multivariate normal distribution N(mean, covariance). The update mechanism delegates to a GaussianBeliefUpdater instance, allowing the same class to work with EKF, UKF, or any custom Gaussian update rule without requiring environment modifications.

This belief type is compatible with PFT_DPW, Sparse-PFT, and SparseSampling planners. It is NOT compatible with POMCP/POMCP_DPW planners because it does not support incremental particle accumulation via inplace_update().

Parameters:
mean

Mean vector of the Gaussian distribution.

covariance

Covariance matrix of the Gaussian distribution.

updater

GaussianBeliefUpdater that computes the Bayesian belief update.

n_terminal_check_samples

Number of Monte Carlo samples for terminal checks.

Example

>>> import numpy as np
>>> np.random.seed(42)
>>>
>>> # Create a linear Kalman filter updater
>>> from POMDPPlanners.core.belief.gaussian_belief_updaters import (
...     LinearKalmanFilterUpdater,
... )
>>> updater = LinearKalmanFilterUpdater(
...     A=np.eye(2), B=np.zeros((2, 1)), H=np.eye(2),
...     Q=0.1 * np.eye(2), R=0.5 * np.eye(2),
... )
>>>
>>> # Create 2D Gaussian belief
>>> mean = np.array([0.0, 0.0])
>>> cov = np.eye(2)
>>> belief = GaussianBelief(mean=mean, covariance=cov, updater=updater)
>>>
>>> # Sample a state
>>> state = belief.sample()
>>> len(state) == 2
True
>>>
>>> # Update belief with observation
>>> new_belief = belief.update(
...     action=np.zeros(1), observation=np.array([1.0, 1.0]), pomdp=None
... )
>>> new_belief.mean.shape
(2,)
property config_id: str

Generate a deterministic identifier based on belief configuration.

property dim: int

Return the dimensionality of the Gaussian belief.

entropy()[source]

Compute the differential entropy of the Gaussian distribution.

Uses the closed-form expression:

H = 0.5 * (d * ln(2 * pi * e) + ln(det(Sigma)))

Return type:

float

Returns:

Differential entropy in nats.
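
The closed-form expression above can be checked directly with NumPy (this computes the formula itself, not via the package class):

```python
import numpy as np

def gaussian_entropy(cov):
    """H = 0.5 * (d * ln(2*pi*e) + ln(det(Sigma))), in nats."""
    d = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)  # numerically stable log-determinant
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

# For the standard 2-D Gaussian, det(Sigma) = 1, so H = ln(2*pi*e) nats.
print(round(gaussian_entropy(np.eye(2)), 4))  # 2.8379
```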

sample()[source]

Sample a state from the Gaussian belief.

Return type:

ndarray

Returns:

A state vector of shape (d,) sampled from N(mean, covariance).

update(action, observation, pomdp=None, state=None)[source]

Update belief using the provided updater.

Parameters:
  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

  • pomdp (Optional[Environment]) – Unused. Kept for interface compatibility with Belief.

  • state (Optional[Any]) – Ignored for Gaussian beliefs.

Return type:

GaussianBelief

Returns:

New GaussianBelief with updated mean and covariance.

class POMDPPlanners.core.belief.GaussianBeliefUpdater[source]

Bases: ABC

Abstract base class for Gaussian belief updaters.

Subclasses implement a Bayesian predict-correct cycle that maps (mean, covariance, action, observation) to an updated (new_mean, new_covariance) pair.

Note

This is an abstract base class and cannot be instantiated directly.

abstract property config_id: str

Return a deterministic identifier for this updater configuration.

abstractmethod update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean (ndarray) – Prior mean vector of shape (d,).

  • covariance (ndarray) – Prior covariance matrix of shape (d, d).

  • action (ndarray) – Action that was executed.

  • observation (ndarray) – Observation that was received.

Return type:

Tuple[ndarray, ndarray]

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.

class POMDPPlanners.core.belief.GaussianMixtureBelief(means, covariances, weights, updater, n_terminal_check_samples=50)[source]

Bases: Belief

Gaussian Mixture Model belief state representation.

Represents the belief as a weighted mixture of multivariate normal distributions: p(x) = sum_k w_k * N(x; mu_k, Sigma_k). The update mechanism delegates to a GaussianMixtureBeliefUpdater instance, allowing flexibility in how mixture components are updated, pruned, or merged.

This belief type is compatible with PFT_DPW, Sparse-PFT, and SparseSampling planners. It is NOT compatible with POMCP/POMCP_DPW planners because it does not support incremental particle accumulation via inplace_update().

Parameters:
means

List of mean vectors, one per component.

covariances

List of covariance matrices, one per component.

weights

Array of mixture weights summing to 1.

updater

GaussianMixtureBeliefUpdater that computes the Bayesian belief update.

n_terminal_check_samples

Number of Monte Carlo samples for terminal checks.

Example

>>> import numpy as np
>>> np.random.seed(42)
>>>
>>> # Define a simple updater that shrinks covariances
>>> from POMDPPlanners.core.belief.gaussian_mixture_belief import (
...     GaussianMixtureBeliefUpdater,
... )
>>> class ShrinkUpdater(GaussianMixtureBeliefUpdater):
...     def update(self, means, covs, weights, action, obs):
...         return means, [c * 0.9 for c in covs], weights
...     @property
...     def config_id(self):
...         return "shrink"
>>>
>>> # Create a 2-component GMM belief in 2D
>>> means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
>>> covs = [np.eye(2), np.eye(2)]
>>> weights = np.array([0.5, 0.5])
>>> belief = GaussianMixtureBelief(
...     means=means, covariances=covs, weights=weights, updater=ShrinkUpdater(),
... )
>>>
>>> # Sample a state
>>> state = belief.sample()
>>> len(state) == 2
True
>>>
>>> # Update belief
>>> new_belief = belief.update(
...     action=0, observation=np.array([1.0, 1.0]), pomdp=None
... )
>>> new_belief.n_components == 2
True
property config_id: str

Generate a deterministic identifier based on belief configuration.

property dim: int

Return the dimensionality of the belief state.

entropy(n_samples=1000)[source]

Estimate the differential entropy via Monte Carlo sampling.

There is no closed-form expression for the entropy of a Gaussian mixture, so this method uses the approximation:

H ~ -mean(log p(x_i)), x_i ~ p(x)

Parameters:

n_samples (int) – Number of Monte Carlo samples. Defaults to 1000.

Return type:

float

Returns:

Estimated differential entropy in nats.
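
The Monte Carlo estimator can be sketched standalone for an identity-covariance mixture (the mixture parameters are illustrative; the package's internals may differ). For two well-separated equal-weight components, the estimate should land near ln(2*pi*e) + ln(2), about 3.53 nats:

```python
import numpy as np

rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [10.0, 10.0]])  # well-separated components
weights = np.array([0.5, 0.5])
d = means.shape[1]

def log_mixture_pdf(x):
    # log p(x) for an identity-covariance Gaussian mixture, evaluated batchwise
    sq = ((x[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)  # (n, k)
    comp = np.exp(-0.5 * sq) * (2 * np.pi) ** (-d / 2)            # component pdfs
    return np.log(comp @ weights)

# Draw x_i ~ p(x): pick a component by weight, then sample from its Gaussian.
n = 20_000
ks = rng.choice(len(weights), size=n, p=weights)
xs = means[ks] + rng.standard_normal((n, d))

# H ~ -mean(log p(x_i))
h_mc = -log_mixture_pdf(xs).mean()
print(h_mc)
```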

property n_components: int

Return the number of mixture components.

sample()[source]

Sample a state from the Gaussian mixture belief.

Selects a component according to the mixture weights, then draws a sample from that component’s Gaussian distribution.

Return type:

ndarray

Returns:

A state vector of shape (d,).

update(action, observation, pomdp=None, state=None)[source]

Update belief using the provided updater.

Parameters:
  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

  • pomdp (Optional[Environment]) – Unused. Kept for interface compatibility with Belief.

  • state (Optional[Any]) – Ignored for Gaussian mixture beliefs.

Return type:

GaussianMixtureBelief

Returns:

New GaussianMixtureBelief with updated components and weights.

class POMDPPlanners.core.belief.GaussianMixtureBeliefUpdater[source]

Bases: ABC

Abstract base class for Gaussian mixture belief updaters.

Subclasses implement an update cycle that maps (means, covariances, weights, action, observation) to an updated (new_means, new_covariances, new_weights) tuple.

Note

This is an abstract base class and cannot be instantiated directly.

abstract property config_id: str

Return a deterministic identifier for this updater configuration.

abstractmethod update(means, covariances, weights, action, observation)[source]

Perform a belief update for the Gaussian mixture.

Parameters:
  • means (List[ndarray]) – List of k mean vectors, each of shape (d,).

  • covariances (List[ndarray]) – List of k covariance matrices, each of shape (d, d).

  • weights (ndarray) – Mixture weights of shape (k,).

  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

Return type:

Tuple[List[ndarray], List[ndarray], ndarray]

Returns:

A tuple (new_means, new_covariances, new_weights).

class POMDPPlanners.core.belief.LinearKalmanFilterUpdater(A, B, H, Q, R)[source]

Bases: GaussianBeliefUpdater

Kalman filter updater for a linear-Gaussian system.

The system model is:

x_{t+1} = A x_t + B u_t + w,  w ~ N(0, Q)
z_t = H x_{t+1} + v,  v ~ N(0, R)

Parameters:
A

State transition matrix of shape (d, d).

B

Control input matrix of shape (d, m).

H

Observation matrix of shape (p, d).

Q

Process noise covariance of shape (d, d).

R

Observation noise covariance of shape (p, p).

Example

>>> import numpy as np
>>> A = np.eye(2)
>>> B = np.zeros((2, 1))
>>> H = np.eye(2)
>>> Q = 0.1 * np.eye(2)
>>> R = 0.5 * np.eye(2)
>>> updater = LinearKalmanFilterUpdater(A=A, B=B, H=H, Q=Q, R=R)
>>> mean = np.zeros(2)
>>> cov = np.eye(2)
>>> new_mean, new_cov = updater.update(mean, cov, np.zeros(1), np.array([1.0, 0.0]))
>>> new_mean.shape
(2,)
property config_id: str

Return a deterministic identifier for this updater configuration.

update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean – Prior mean vector of shape (d,).

  • covariance – Prior covariance matrix of shape (d, d).

  • action – Action that was executed.

  • observation – Observation that was received.

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.

class POMDPPlanners.core.belief.UnscentedKalmanFilterUpdater(transition_fn, observation_fn, Q, R, alpha=0.001, beta=2.0, kappa=0.0)[source]

Bases: GaussianBeliefUpdater

Unscented Kalman Filter updater for nonlinear systems.

The system model is:

x_{t+1} = f(x_t, u_t) + w,  w ~ N(0, Q)
z_t = h(x_{t+1}) + v,  v ~ N(0, R)

Unlike the EKF, the UKF does not require Jacobians. Instead, it propagates deterministic sigma points through the nonlinear functions to estimate the posterior statistics.

Parameters:
transition_fn

State transition function f(state, action) -> next_state.

observation_fn

Observation function h(state) -> observation.

Q

Process noise covariance of shape (d, d).

R

Observation noise covariance of shape (p, p).

alpha

Spread of sigma points around the mean.

beta

Prior knowledge about the distribution (2.0 is optimal for Gaussian).

kappa

Secondary scaling parameter.

Example

>>> import numpy as np
>>> f = lambda x, u: x
>>> h = lambda x: x
>>> Q = 0.1 * np.eye(2)
>>> R = 0.5 * np.eye(2)
>>> updater = UnscentedKalmanFilterUpdater(
...     transition_fn=f, observation_fn=h, Q=Q, R=R,
... )
>>> mean = np.zeros(2)
>>> cov = np.eye(2)
>>> new_mean, new_cov = updater.update(mean, cov, np.zeros(1), np.array([1.0, 0.0]))
>>> new_mean.shape
(2,)
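
The sigma-point construction the UKF relies on can be sketched independently of the class. With scaling lambda = alpha^2 * (d + kappa) - d, the 2d + 1 weighted points reproduce the prior mean and covariance exactly (alpha is chosen larger here than the class default of 0.001 purely so the illustration is well conditioned):

```python
import numpy as np

def sigma_points(mean, cov, alpha=0.5, beta=2.0, kappa=0.0):
    d = len(mean)
    lam = alpha**2 * (d + kappa) - d
    sqrt_cov = np.linalg.cholesky((d + lam) * cov)
    # Central point plus symmetric pairs along the matrix square-root columns.
    pts = np.vstack([mean, mean + sqrt_cov.T, mean - sqrt_cov.T])  # (2d+1, d)
    wm = np.full(2 * d + 1, 1.0 / (2 * (d + lam)))
    wc = wm.copy()
    wm[0] = lam / (d + lam)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    return pts, wm, wc

mean = np.array([1.0, -2.0])
cov = np.array([[2.0, 0.3], [0.3, 1.0]])
pts, wm, wc = sigma_points(mean, cov)

# The weighted sigma points recover the prior statistics exactly.
print(np.allclose(wm @ pts, mean))  # True
diff = pts - mean
print(np.allclose((wc[:, None] * diff).T @ diff, cov))  # True
```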
property config_id: str

Return a deterministic identifier for this updater configuration.

update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean – Prior mean vector of shape (d,).

  • covariance – Prior covariance matrix of shape (d, d).

  • action – Action that was executed.

  • observation – Observation that was received.

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.

class POMDPPlanners.core.belief.UnweightedParticleBelief(particles, reinvigoration_fraction=0.2)[source]

Bases: Belief

Unweighted particle belief implementation.

This class implements a particle filter with uniform particles.

Parameters:

particles (list)

reinvigorate(action, observation, pomdp)[source]

Simulate a new particle that matches the action-observation pair.

sample()[source]

Sample a particle from the belief.

update(action, observation, pomdp, state=None)[source]

Update belief with action-observation pair.

Return type:

UnweightedParticleBelief

class POMDPPlanners.core.belief.UnweightedParticleBeliefStateUpdate(particles=None)[source]

Bases: Belief

Uniform particle belief for incremental state accumulation.

This class implements a lightweight belief representation that maintains a collection of state particles with uniform probability distribution. Unlike weighted particle filters, all particles contribute equally to the belief state, making it suitable for discrete observation spaces where observation likelihoods are binary (match/no-match) rather than continuous probability distributions.

UnweightedParticleBeliefStateUpdate is designed for online learning and planning algorithms that incrementally accumulate particles during tree expansion or sequential state estimation. It provides both in-place and immutable update operations for different algorithmic requirements.

Key Features:
  • Uniform Weighting: All particles have equal probability weight
  • Incremental Accumulation: Add particles one-by-one without resampling
  • Memory Efficient: No weight storage, minimal memory overhead
  • Fast Sampling: Simple uniform random sampling from particle set
  • Efficient Updates: Both in-place and immutable update operations
  • Deterministic Config ID: Order-invariant identification for caching

Mathematical Foundation: The belief represents a discrete uniform distribution over accumulated particles. Each particle has equal probability 1/N where N is the total number of particles. For particles with the same state value, the probability is proportional to their count:

P(s) = count(s) / N = |{i: s_i = s}| / |particles|

This makes it ideal for discrete observation models where observations either match a state (probability 1) or don’t match (probability 0).
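
The counting formula above amounts to the following (standalone illustration; the state names follow the Tiger example used later in this page):

```python
from collections import Counter

particles = ["tiger_left", "tiger_right", "tiger_left", "tiger_left"]
counts = Counter(particles)
n = len(particles)

# P(s) = count(s) / N for every state in the particle set
probs = {s: c / n for s, c in counts.items()}
print(probs["tiger_left"])   # 0.75
print(probs["tiger_right"])  # 0.25
```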

Parameters:

particles (list | None)

particles

List of state particles, each with uniform probability

weights_sum

Total number of particles (equivalent to uniform weight sum)

Example

>>> from POMDPPlanners.environments.tiger_pomdp import TigerPOMDP
>>> env = TigerPOMDP(discount_factor=0.95)
>>> belief = UnweightedParticleBeliefStateUpdate(particles=[])
>>> belief.inplace_update("listen", "hear_left", env, "tiger_left")
>>> belief.inplace_update("listen", "hear_right", env, "tiger_right")
>>> sampled_state = belief.sample()
>>> sampled_state in ["tiger_left", "tiger_right"]
True
property config_id: str

Generate a deterministic identifier based on belief configuration.

This implementation ensures that config_id is invariant to the order of particles by sorting them.

inplace_update(action, observation, pomdp, state=None)[source]

Add a state particle with uniform weight to current belief.

This method modifies the current belief in-place by appending a new particle. Unlike weighted beliefs, no observation likelihood computation is performed; the new particle simply joins the uniform distribution.

Parameters:
  • action (Any) – Action that was executed to reach the state (not used for weighting)

  • observation (Any) – Observation received after executing the action (not used for weighting)

  • pomdp (Environment) – Environment instance (not used for uniform weighting)

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

None

sample()[source]

Sample a state uniformly from the current belief distribution.

Return type:

Any

Returns:

A state sampled uniformly from the particle set

Raises:

IndexError – If belief is empty (no particles to sample from)

update(action, observation, pomdp, state=None)[source]

Create new belief by adding a state particle with uniform weight.

This method creates a new belief instance without modifying the current one. Unlike weighted beliefs, all particles (including the new one) have equal probability in the resulting belief distribution.

Parameters:
  • action (Any) – Action that was executed to reach the state (not used for weighting)

  • observation (Any) – Observation received after executing the action (not used for weighting)

  • pomdp (Environment) – Environment instance (not used for uniform weighting)

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

UnweightedParticleBeliefStateUpdate

Returns:

New UnweightedParticleBeliefStateUpdate instance with the additional particle.

class POMDPPlanners.core.belief.VectorizedParticleBeliefUpdater[source]

Bases: ABC

Abstract base class for vectorized particle belief updaters.

Subclasses implement batched transition and observation log-likelihood methods that operate on the full particle array at once, enabling NumPy-level vectorization instead of Python loops.

Note

This is an abstract base class and cannot be instantiated directly.

abstractmethod batch_observation_log_likelihood(next_particles, action, observation)[source]

Compute observation log-likelihoods for all particles at once.

Parameters:
  • next_particles (ndarray) – Transitioned particle states of shape (N, d).

  • action (ndarray) – Action vector.

  • observation (ndarray) – Observed value.

Return type:

ndarray

Returns:

Log-likelihoods of shape (N,).

abstractmethod batch_transition(particles, action)[source]

Transition all particles in a single batched operation.

Parameters:
  • particles (ndarray) – Current particle states of shape (N, d).

  • action (ndarray) – Action vector.

Return type:

ndarray

Returns:

Next-state particles of shape (N, d).

abstract property config_id: str

Return a deterministic identifier for this updater configuration.
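
As an illustration of the kind of batched operations a subclass supplies, here is a standalone NumPy sketch (plain functions rather than the package ABC; the additive dynamics and isotropic Gaussian observation model are assumptions):

```python
import numpy as np

def batch_transition(particles, action):
    # Propagate all N particles in one vectorized step: x' = x + u
    return particles + action  # (N, d) + (d,) broadcasts over particles

def batch_observation_log_likelihood(next_particles, action, observation, sigma=0.5):
    # log N(z; x', sigma^2 I) evaluated for every particle simultaneously
    d = next_particles.shape[1]
    sq = ((next_particles - observation) ** 2).sum(axis=1)
    return -0.5 * sq / sigma**2 - 0.5 * d * np.log(2 * np.pi * sigma**2)

rng = np.random.default_rng(0)
particles = rng.standard_normal((100, 2))
moved = batch_transition(particles, np.array([1.0, 0.0]))
logw = batch_observation_log_likelihood(moved, None, np.array([1.0, 0.0]))
print(moved.shape, logw.shape)  # (100, 2) (100,)
```

No Python loop touches individual particles; both operations are single array expressions over the full (N, d) batch.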

class POMDPPlanners.core.belief.VectorizedWeightedParticleBelief(particles, log_weights, updater, resampling=False, ess_factor=0.5)[source]

Bases: Belief

Vectorized weighted particle filter for POMDP belief states.

Stores particles as a 2-D NumPy array of shape (N, d) and performs all update operations via a VectorizedParticleBeliefUpdater, so the entire predict-reweight-resample cycle runs without Python loops over particles.

Parameters:
particles

Particle array of shape (N, d).

log_weights

Log-weights of shape (N,).

normalized_weights

Probability weights of shape (N,).

updater

Vectorized updater instance.

resampling

Whether automatic ESS-based resampling is enabled.

ess_factor

Fraction of N used as the ESS threshold.

Example

>>> import numpy as np
>>> np.random.seed(42)
>>>
>>> # Create a trivial identity updater for demonstration
>>> from POMDPPlanners.core.belief.vectorized_particle_belief_updater import (
...     VectorizedParticleBeliefUpdater,
... )
>>> class IdentityUpdater(VectorizedParticleBeliefUpdater):
...     def batch_transition(self, particles, action):
...         return particles + action
...     def batch_observation_log_likelihood(self, next_particles, action, observation):
...         return np.zeros(len(next_particles))
...     @property
...     def config_id(self):
...         return "identity"
>>>
>>> particles = np.random.randn(20, 2)
>>> log_w = np.log(np.ones(20) / 20)
>>> belief = VectorizedWeightedParticleBelief(
...     particles=particles,
...     log_weights=log_w,
...     updater=IdentityUpdater(),
...     resampling=True,
... )
>>> state = belief.sample()
>>> state.shape
(2,)
>>> new_belief = belief.update(
...     action=np.array([1.0, 0.0]),
...     observation=np.array([0.5, 0.5]),
...     pomdp=None,
... )
>>> new_belief.particles.shape
(20, 2)
property config_id: str

Generate a deterministic identifier based on belief configuration.

property dim: int

Return the state dimensionality.

property n_particles: int

Return the number of particles.

sample()[source]

Sample a state from the belief.

Return type:

ndarray

Returns:

A state vector of shape (d,).

update(action, observation, pomdp=None, state=None)[source]

Update belief using the vectorized updater.

Parameters:
  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

  • pomdp (Optional[Environment]) – Unused. Kept for interface compatibility.

  • state (Optional[Any]) – Ignored.

Return type:

VectorizedWeightedParticleBelief

Returns:

New VectorizedWeightedParticleBelief with updated particles and weights.

class POMDPPlanners.core.belief.WeightedParticleBelief(particles, log_weights, resampling=False, ess_factor=0.5)[source]

Bases: Belief

Weighted particle filter implementation for POMDP belief states.

This class implements a particle filter with weighted particles, suitable for continuous observation spaces. It supports automatic resampling based on effective sample size to maintain particle diversity.

Parameters:
particles

List of state particles representing the belief

log_weights

Log-weights of particles in log space for numerical stability

normalized_weights

Normalized probability weights (computed automatically)

resampling

Whether automatic resampling is enabled

ess_factor

Effective sample size factor for resampling threshold

ess_threshold

Computed threshold for triggering resampling

eps

Small epsilon value for numerical stability in weight updates

Example

>>> import numpy as np
>>> # Create belief with 10 particles
>>> particles = [[x, y] for x, y in zip(np.random.randn(10), np.random.randn(10))]
>>> log_weights = np.log(np.ones(10) / 10)  # Uniform weights
>>> belief = WeightedParticleBelief(
...     particles=particles,
...     log_weights=log_weights,
...     resampling=True,
...     ess_factor=0.5
... )
>>> # Sample a state from belief
>>> state = belief.sample()
>>> len(state) == 2  # [x, y] coordinate
True
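
The ESS-based resampling trigger can be sketched as follows (systematic resampling is shown as one common low-variance scheme; the class's internal scheme is not specified here):

```python
import numpy as np

def effective_sample_size(log_weights):
    w = np.exp(log_weights - log_weights.max())
    w /= w.sum()
    return 1.0 / np.sum(w**2)  # ESS = 1 / sum(w_i^2)

def systematic_resample(rng, particles, weights):
    # Low-variance resampling: one stratified draw per particle slot.
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx]

rng = np.random.default_rng(42)
n = 10
log_w = np.log(np.ones(n) / n)
print(round(effective_sample_size(log_w)))  # 10 for uniform weights

# Degenerate weights drive the ESS toward 1, tripping an ess_factor * N threshold.
weights2 = np.array([0.91] + [0.01] * 9)
ess = effective_sample_size(np.log(weights2))
print(ess < 0.5 * n)  # True -> resample
resampled = systematic_resample(rng, np.arange(n), weights2)
print(resampled.shape)  # (10,)
```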
property config_id: str

Generate a deterministic identifier based on belief configuration.

sample()[source]

Sample a particle from the belief.

to_dict()[source]

Convert the belief to a dictionary for serialization.

Returns:

A dictionary containing all necessary fields for deserialization.

Return type:

dict

to_unique_support_distribution()[source]

Convert the belief to a DiscreteDistribution with unique particles.

Returns:

A distribution where each particle appears only once, with its probability being the sum of all its occurrences in the original belief.

Return type:

DiscreteDistribution

update(action, observation, pomdp, state=None)[source]

Update belief with action-observation pair.

Return type:

WeightedParticleBelief

class POMDPPlanners.core.belief.WeightedParticleBeliefReinvigoration(particles, log_weights, resampling=True, ess_factor=0.5, reinvigoration_fraction=0.2)[source]

Bases: WeightedParticleBelief, ABC

Weighted particle belief with reinvigoration capability.

abstractmethod reinvigorate(action, observation, pomdp, belief)[source]

Implement reinvigoration for specific POMDP environment.

Return type:

Belief

update(action, observation, pomdp, state=None)[source]

Update belief with reinvigoration.

Return type:

WeightedParticleBelief

class POMDPPlanners.core.belief.WeightedParticleBeliefStateUpdate(particles=None, weights=None)[source]

Bases: Belief

Incremental weighted particle belief for online state estimation.

This class implements a lightweight belief representation that incrementally accumulates state particles with associated observation likelihood weights. It is designed for online learning and planning algorithms that build beliefs by sequentially adding individual state samples rather than maintaining a fixed-size particle set.

Unlike traditional particle filters that maintain a fixed number of particles through resampling, WeightedParticleBeliefStateUpdate grows dynamically by accumulating particles with observation-based weights. This makes it particularly suitable for Monte Carlo Tree Search (MCTS) algorithms where beliefs are built incrementally during tree expansion.

Key Features:
  • Incremental Accumulation: Add particles one-by-one without resampling
  • Observation Weighting: Each particle weighted by observation likelihood
  • Efficient Updates: Both in-place and immutable update operations
  • Weighted Sampling: Sample states proportionally to observation evidence
  • Memory Efficient: No fixed particle budget, grows as needed
  • Deterministic Config ID: Order-invariant identification for caching

Mathematical Foundation: The belief represents a discrete probability distribution where each particle s_i has weight w_i = P(o|s_i,a), the observation likelihood. The probability of state s is proportional to the sum of weights for all particles with that state:

P(s|o,a) ∝ Σ_{i: s_i=s} w_i
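
Aggregating duplicate particles by summed weight, as in the formula above, can be sketched as (the 0.85/0.15 likelihoods are illustrative, matching a Tiger-style listen action):

```python
from collections import defaultdict

# Particles with observation-likelihood weights w_i = P(o | s_i, a)
particles = ["tiger_left", "tiger_right", "tiger_left"]
weights = [0.85, 0.15, 0.85]

# P(s | o, a) proportional to the sum of w_i over particles with state s
totals = defaultdict(float)
for s, w in zip(particles, weights):
    totals[s] += w
z = sum(totals.values())
posterior = {s: w / z for s, w in totals.items()}
print(round(posterior["tiger_left"], 3))  # 0.919
```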

Parameters:
  • particles (list | None)

  • weights (list | None)

particles

List of state particles representing possible world states

weights

List of observation likelihood weights for each particle

weights_sum

Running sum of all weights for efficient normalization

Example

>>> from POMDPPlanners.environments.tiger_pomdp import TigerPOMDP
>>> env = TigerPOMDP(discount_factor=0.95)
>>> belief = WeightedParticleBeliefStateUpdate(particles=[], weights=[])
>>> belief.inplace_update("listen", "hear_left", env, "tiger_left")
>>> belief.inplace_update("listen", "hear_left", env, "tiger_right")
>>> sampled_state = belief.sample()
>>> sampled_state in ["tiger_left", "tiger_right"]
True
property config_id: str

Generate a deterministic identifier based on belief configuration.

This implementation ensures that config_id is invariant to the order of particles and weights, similar to WeightedParticleBelief.

inplace_update(action, observation, pomdp, state=None)[source]

Add a state particle with observation weight to current belief.

This method modifies the current belief in-place by appending a new particle and its corresponding observation likelihood weight. The weight is computed using the environment’s observation model and efficiently updates the running weight sum.

Parameters:
  • action (Any) – Action that was executed to reach the state

  • observation (Any) – Observation received after executing the action

  • pomdp (Environment) – Environment providing the observation model for weight computation

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

None

sample()[source]

Sample a state from the current belief distribution.

Return type:

Any

Returns:

A state sampled according to the belief’s probability distribution

Raises:

ValueError – If belief is empty or has zero weights.

to_unique_support_distribution()[source]

Convert the belief to a DiscreteDistribution with unique particles.

Returns:

A distribution where each particle appears only once, with its probability being the sum of all its occurrences in the original belief.

Return type:

DiscreteDistribution

update(action, observation, pomdp, state=None)[source]

Create new belief by adding a state particle with observation weight.

This method creates a new belief instance without modifying the current one. The new particle’s weight is computed as the observation likelihood given the state and action using the environment’s observation model.

Parameters:
  • action (Any) – Action that was executed to reach the state

  • observation (Any) – Observation received after executing the action

  • pomdp (Environment) – Environment providing the observation model for weight computation

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

WeightedParticleBeliefStateUpdate

Returns:

New WeightedParticleBeliefStateUpdate instance with the additional particle and updated weights.

POMDPPlanners.core.belief.get_initial_belief(pomdp, n_particles, resampling=True)[source]

Create initial belief from environment’s initial state distribution.

Parameters:
  • pomdp (Environment) – Environment to get initial distribution from

  • n_particles (int) – Number of particles to generate for the belief

  • resampling (bool) – Enable resampling in the created belief. Defaults to True.

Return type:

WeightedParticleBelief

Returns:

WeightedParticleBelief with uniform weights over initial states

POMDPPlanners.core.belief.get_unique_support(particles, probabilities)[source]

Extract unique particles and their combined probabilities.

This function takes a list of particles and their associated probabilities, combines probabilities for duplicate particles, and returns unique particles with normalized probabilities.

Parameters:
  • particles (List[Any]) – List of particles of any type

  • probabilities (ndarray) – Array of probabilities/weights corresponding to each particle

Returns:

  • List of unique particles (preserving original types)

  • Normalized numpy array of probabilities summing to 1

Return type:

Tuple[List[Any], ndarray]

Example

>>> particles = [1, 2, 1, 3, 2]
>>> probs = np.array([0.2, 0.3, 0.1, 0.2, 0.2])
>>> unique_particles, unique_probs = get_unique_support(particles, probs)
>>> unique_particles  # [1, 2, 3]
[1, 2, 3]
>>> float(np.sum(unique_probs))  # Should be 1.0
1.0
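For reference, the combining step can be sketched in a few lines of self-contained numpy (an illustrative reimplementation, not the library code):

```python
import numpy as np

def unique_support(particles, probabilities):
    """Combine probabilities of duplicate particles and normalize."""
    combined = {}
    for p, w in zip(particles, probabilities):
        combined[p] = combined.get(p, 0.0) + float(w)
    unique = list(combined.keys())  # insertion order preserves first occurrence
    probs = np.array([combined[p] for p in unique])
    return unique, probs / probs.sum()  # normalize so the weights sum to 1

particles = [1, 2, 1, 3, 2]
probs = np.array([0.2, 0.3, 0.1, 0.2, 0.2])
u, q = unique_support(particles, probs)  # u == [1, 2, 3], q == [0.3, 0.5, 0.2]
```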
POMDPPlanners.core.belief.is_terminal_belief(belief, env)[source]

Check if the belief is terminal.

Return type:

bool

POMDPPlanners.core.belief.is_terminal_particle_belief(belief, env)[source]

Check if the belief is terminal.

Return type:

bool

POMDPPlanners.core.belief.sample_next_belief(belief, action, pomdp)[source]

Simulate one step of belief evolution.

This function samples a state from the current belief, simulates the environment dynamics, and updates the belief with the resulting observation.

Parameters:
  • belief (Belief) – Current belief state

  • action (Any) – Action to execute

  • pomdp (Environment) – Environment providing dynamics models

Returns:

  • Updated belief after incorporating the observation

  • Observation that was generated

Return type:

Tuple[Belief, Any]

Submodules

POMDPPlanners.core.belief.base_belief module

Abstract base class for POMDP belief state representations.

This module provides the foundational Belief abstract base class that defines the interface for all belief state representations in POMDP environments.

Classes:

Belief: Abstract base class for belief representations

class POMDPPlanners.core.belief.base_belief.Belief[source]

Bases: ABC

Abstract base class for POMDP belief state representations.

This class defines the interface for belief states in POMDP environments. Belief states represent probability distributions over the state space, capturing the agent’s uncertainty about the current state.

Note

This is an abstract base class and cannot be instantiated directly. Subclasses must implement the update() and sample() methods.

property config_id: str

Generate a deterministic identifier based on belief configuration.

classmethod from_config(config)[source]

Create a belief instance from configuration.

Factory method that dynamically creates belief instances based on configuration objects specifying the class name and parameters.

Parameters:

config – Configuration object with class_name and params attributes

Returns:

New belief instance of the specified type

Raises:

ValueError – If the specified belief class is not found
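A name-based factory of this kind can be sketched as follows; the registry, the `DummyBelief` class, and the lookup details here are illustrative, not the library's actual mechanism:

```python
from types import SimpleNamespace

# Hypothetical class registry; the real library resolves class names differently.
REGISTRY = {}

def register(cls):
    REGISTRY[cls.__name__] = cls
    return cls

@register
class DummyBelief:
    def __init__(self, n_particles=10):
        self.n_particles = n_particles

def from_config(config):
    """Instantiate the class named by config.class_name with config.params."""
    try:
        cls = REGISTRY[config.class_name]
    except KeyError:
        raise ValueError(f"Belief class not found: {config.class_name}")
    return cls(**config.params)

belief = from_config(SimpleNamespace(class_name="DummyBelief",
                                     params={"n_particles": 5}))
```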

inplace_update(action, observation, pomdp, state=None)[source]

Update the belief in place given an action-observation pair.

Return type:

None
abstractmethod sample()[source]

Sample a state from the current belief distribution.

Return type:

Any

Returns:

A state sampled according to the belief’s probability distribution

Note

Subclasses must implement this method to enable state sampling for planning and simulation purposes.

abstractmethod update(action, observation, pomdp, state=None)[source]

Update belief given an action-observation pair.

Performs Bayesian belief update using the environment’s transition and observation models.

Parameters:
  • action (Any) – Action that was executed

  • observation (Any) – Observation that was received

  • pomdp (Environment) – Environment providing transition and observation models

  • state (Any | None)

Return type:

Belief

Returns:

Updated belief state reflecting the new information

Note

Subclasses must implement this method according to their specific belief representation and update strategy.

POMDPPlanners.core.belief.belief_utils module

Module-level helper functions for belief state operations.

This module provides utility functions for common belief operations such as sampling the next belief, creating initial beliefs, and checking terminal conditions across different belief types.

Functions:

sample_next_belief: Simulate one step of belief evolution
get_initial_belief: Create initial belief from environment’s initial distribution
is_terminal_particle_belief: Check if a particle belief is terminal
is_terminal_belief: Check if any belief type is terminal

POMDPPlanners.core.belief.belief_utils.get_initial_belief(pomdp, n_particles, resampling=True)[source]

Create initial belief from environment’s initial state distribution.

Parameters:
  • pomdp (Environment) – Environment to get initial distribution from

  • n_particles (int) – Number of particles to generate for the belief

  • resampling (bool) – Enable resampling in the created belief. Defaults to True.

Return type:

WeightedParticleBelief

Returns:

WeightedParticleBelief with uniform weights over initial states

POMDPPlanners.core.belief.belief_utils.is_terminal_belief(belief, env)[source]

Check if the belief is terminal.

Return type:

bool

POMDPPlanners.core.belief.belief_utils.is_terminal_particle_belief(belief, env)[source]

Check if the belief is terminal.

Return type:

bool

POMDPPlanners.core.belief.belief_utils.sample_next_belief(belief, action, pomdp)[source]

Simulate one step of belief evolution.

This function samples a state from the current belief, simulates the environment dynamics, and updates the belief with the resulting observation.

Parameters:
  • belief (Belief) – Current belief state

  • action (Any) – Action to execute

  • pomdp (Environment) – Environment providing dynamics models

Returns:

  • Updated belief after incorporating the observation

  • Observation that was generated

Return type:

Tuple[Belief, Any]

POMDPPlanners.core.belief.gaussian_belief module

Gaussian belief state representation for POMDP environments.

This module provides a multivariate Gaussian belief state that delegates updates to a GaussianBeliefUpdater instance, allowing compatibility with EKF, UKF, or any custom Gaussian update rule.

Classes:

GaussianBelief: Multivariate Gaussian belief with pluggable updater.

class POMDPPlanners.core.belief.gaussian_belief.GaussianBelief(mean, covariance, updater, n_terminal_check_samples=50)[source]

Bases: Belief

Multivariate Gaussian belief state representation.

Represents the belief as a multivariate normal distribution N(mean, covariance). The update mechanism delegates to a GaussianBeliefUpdater instance, allowing the same class to work with EKF, UKF, or any custom Gaussian update rule without requiring environment modifications.

This belief type is compatible with PFT_DPW, Sparse-PFT, and SparseSampling planners. It is NOT compatible with POMCP/POMCP_DPW planners because it does not support incremental particle accumulation via inplace_update().

Parameters:
mean

Mean vector of the Gaussian distribution.

covariance

Covariance matrix of the Gaussian distribution.

updater

GaussianBeliefUpdater that computes the Bayesian belief update.

n_terminal_check_samples

Number of Monte Carlo samples for terminal checks.

Example

>>> import numpy as np
>>> np.random.seed(42)
>>>
>>> # Create a linear Kalman filter updater
>>> from POMDPPlanners.core.belief.gaussian_belief_updaters import (
...     LinearKalmanFilterUpdater,
... )
>>> updater = LinearKalmanFilterUpdater(
...     A=np.eye(2), B=np.zeros((2, 1)), H=np.eye(2),
...     Q=0.1 * np.eye(2), R=0.5 * np.eye(2),
... )
>>>
>>> # Create 2D Gaussian belief
>>> mean = np.array([0.0, 0.0])
>>> cov = np.eye(2)
>>> belief = GaussianBelief(mean=mean, covariance=cov, updater=updater)
>>>
>>> # Sample a state
>>> state = belief.sample()
>>> len(state) == 2
True
>>>
>>> # Update belief with observation
>>> new_belief = belief.update(
...     action=np.zeros(1), observation=np.array([1.0, 1.0]), pomdp=None
... )
>>> new_belief.mean.shape
(2,)
property config_id: str

Generate a deterministic identifier based on belief configuration.

property dim: int

Return the dimensionality of the Gaussian belief.

entropy()[source]

Compute the differential entropy of the Gaussian distribution.

Uses the closed-form expression:

H = 0.5 * (d * ln(2 * pi * e) + ln(det(Sigma)))

Return type:

float

Returns:

Differential entropy in nats.
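The closed form is easy to check numerically with a self-contained sketch (not the library code):

```python
import numpy as np

def gaussian_entropy(cov):
    """Closed-form differential entropy of N(mu, cov) in nats:
    H = 0.5 * (d * ln(2*pi*e) + ln det(Sigma))."""
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)  # numerically stable log-determinant
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

# For the identity covariance, det(Sigma) = 1, so H = 0.5 * d * ln(2*pi*e).
h = gaussian_entropy(np.eye(2))
```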

sample()[source]

Sample a state from the Gaussian belief.

Return type:

ndarray

Returns:

A state vector of shape (d,) sampled from N(mean, covariance).

update(action, observation, pomdp=None, state=None)[source]

Update belief using the provided updater.

Parameters:
  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

  • pomdp (Optional[Environment]) – Unused. Kept for interface compatibility with Belief.

  • state (Optional[Any]) – Ignored for Gaussian beliefs.

Return type:

GaussianBelief

Returns:

New GaussianBelief with updated mean and covariance.

POMDPPlanners.core.belief.gaussian_belief_updaters module

Gaussian belief updater abstract base class and concrete implementations.

This module provides the GaussianBeliefUpdater ABC and three concrete implementations for common Bayesian filtering algorithms. Each updater captures its system model parameters at construction time, so the update method only requires the current belief statistics and the latest action-observation pair.

Classes:

GaussianBeliefUpdater: Abstract base class for Gaussian belief updaters.
LinearKalmanFilterUpdater: Updater for linear-Gaussian systems.
ExtendedKalmanFilterUpdater: Updater for nonlinear systems with known Jacobians.
UnscentedKalmanFilterUpdater: Updater for nonlinear systems without Jacobians.

class POMDPPlanners.core.belief.gaussian_belief_updaters.ExtendedKalmanFilterUpdater(transition_fn, observation_fn, transition_jacobian, observation_jacobian, Q, R)[source]

Bases: GaussianBeliefUpdater

Extended Kalman Filter updater for nonlinear systems.

The system model is:

x_{t+1} = f(x_t, u_t) + w,  w ~ N(0, Q)
z_t = h(x_{t+1}) + v,  v ~ N(0, R)

The EKF linearises around the current estimate using the provided Jacobians.

Parameters:
transition_fn

State transition function f(state, action) -> next_state.

observation_fn

Observation function h(state) -> observation.

transition_jacobian

Jacobian of f w.r.t. state, F(state, action) -> (d, d).

observation_jacobian

Jacobian of h w.r.t. state, H(state) -> (p, d).

Q

Process noise covariance of shape (d, d).

R

Observation noise covariance of shape (p, p).

Example

>>> import numpy as np
>>> f = lambda x, u: x
>>> h = lambda x: x
>>> F = lambda x, u: np.eye(len(x))
>>> H = lambda x: np.eye(len(x))
>>> Q = 0.1 * np.eye(2)
>>> R = 0.5 * np.eye(2)
>>> updater = ExtendedKalmanFilterUpdater(
...     transition_fn=f, observation_fn=h,
...     transition_jacobian=F, observation_jacobian=H, Q=Q, R=R,
... )
>>> mean = np.zeros(2)
>>> cov = np.eye(2)
>>> new_mean, new_cov = updater.update(mean, cov, np.zeros(1), np.array([1.0, 0.0]))
>>> new_mean.shape
(2,)
property config_id: str

Return a deterministic identifier for this updater configuration.

update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean – Prior mean vector of shape (d,).

  • covariance – Prior covariance matrix of shape (d, d).

  • action – Action that was executed.

  • observation – Observation that was received.

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.
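Assuming the standard EKF equations, the predict-correct cycle can be sketched as follows (a textbook sketch, not necessarily the library's exact implementation):

```python
import numpy as np

def ekf_update(mean, cov, action, obs, f, h, F, H_jac, Q, R):
    """One EKF cycle: propagate through the nonlinear model, linearized
    via the Jacobians F (transition) and H_jac (observation)."""
    # Predict through the nonlinear transition
    mean_pred = f(mean, action)
    Fm = F(mean, action)
    cov_pred = Fm @ cov @ Fm.T + Q
    # Correct using the observation Jacobian at the predicted mean
    Hm = H_jac(mean_pred)
    S = Hm @ cov_pred @ Hm.T + R             # innovation covariance
    K = cov_pred @ Hm.T @ np.linalg.inv(S)   # Kalman gain
    new_mean = mean_pred + K @ (obs - h(mean_pred))
    new_cov = (np.eye(len(mean)) - K @ Hm) @ cov_pred
    return new_mean, new_cov

# Identity dynamics reduce the EKF to a linear Kalman filter.
f = lambda x, u: x
h = lambda x: x
F = lambda x, u: np.eye(2)
Hj = lambda x: np.eye(2)
m, P = ekf_update(np.zeros(2), np.eye(2), np.zeros(1), np.array([1.0, 0.0]),
                  f, h, F, Hj, 0.1 * np.eye(2), 0.5 * np.eye(2))
```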

class POMDPPlanners.core.belief.gaussian_belief_updaters.GaussianBeliefUpdater[source]

Bases: ABC

Abstract base class for Gaussian belief updaters.

Subclasses implement a Bayesian predict-correct cycle that maps (mean, covariance, action, observation) to an updated (new_mean, new_covariance) pair.

Note

This is an abstract base class and cannot be instantiated directly.

abstract property config_id: str

Return a deterministic identifier for this updater configuration.

abstractmethod update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean (ndarray) – Prior mean vector of shape (d,).

  • covariance (ndarray) – Prior covariance matrix of shape (d, d).

  • action (ndarray) – Action that was executed.

  • observation (ndarray) – Observation that was received.

Return type:

Tuple[ndarray, ndarray]

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.

class POMDPPlanners.core.belief.gaussian_belief_updaters.LinearKalmanFilterUpdater(A, B, H, Q, R)[source]

Bases: GaussianBeliefUpdater

Kalman filter updater for a linear-Gaussian system.

The system model is:

x_{t+1} = A x_t + B u_t + w,  w ~ N(0, Q)
z_t = H x_{t+1} + v,  v ~ N(0, R)

Parameters:
A

State transition matrix of shape (d, d).

B

Control input matrix of shape (d, m).

H

Observation matrix of shape (p, d).

Q

Process noise covariance of shape (d, d).

R

Observation noise covariance of shape (p, p).

Example

>>> import numpy as np
>>> A = np.eye(2)
>>> B = np.zeros((2, 1))
>>> H = np.eye(2)
>>> Q = 0.1 * np.eye(2)
>>> R = 0.5 * np.eye(2)
>>> updater = LinearKalmanFilterUpdater(A=A, B=B, H=H, Q=Q, R=R)
>>> mean = np.zeros(2)
>>> cov = np.eye(2)
>>> new_mean, new_cov = updater.update(mean, cov, np.zeros(1), np.array([1.0, 0.0]))
>>> new_mean.shape
(2,)
property config_id: str

Return a deterministic identifier for this updater configuration.

update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean – Prior mean vector of shape (d,).

  • covariance – Prior covariance matrix of shape (d, d).

  • action – Action that was executed.

  • observation – Observation that was received.

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.
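Assuming the textbook predict-correct equations, a linear Kalman update can be sketched in plain numpy (the library's implementation may differ in numerical details, e.g. a Joseph-form covariance update):

```python
import numpy as np

def kf_update(mean, cov, action, obs, A, B, H, Q, R):
    """One predict-correct cycle for x' = A x + B u + w,  z = H x' + v."""
    # Predict
    mean_pred = A @ mean + B @ action
    cov_pred = A @ cov @ A.T + Q
    # Correct
    S = H @ cov_pred @ H.T + R             # innovation covariance
    K = cov_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    new_mean = mean_pred + K @ (obs - H @ mean_pred)
    new_cov = (np.eye(len(mean)) - K @ H) @ cov_pred
    return new_mean, new_cov

A, B, H = np.eye(2), np.zeros((2, 1)), np.eye(2)
Q, R = 0.1 * np.eye(2), 0.5 * np.eye(2)
m, P = kf_update(np.zeros(2), np.eye(2), np.zeros(1), np.array([1.0, 0.0]),
                 A, B, H, Q, R)
```

With these numbers the predicted covariance is 1.1·I, the gain is 1.1/1.6 ≈ 0.6875, and the posterior covariance contracts to about 0.344·I.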

class POMDPPlanners.core.belief.gaussian_belief_updaters.UnscentedKalmanFilterUpdater(transition_fn, observation_fn, Q, R, alpha=0.001, beta=2.0, kappa=0.0)[source]

Bases: GaussianBeliefUpdater

Unscented Kalman Filter updater for nonlinear systems.

The system model is:

x_{t+1} = f(x_t, u_t) + w,  w ~ N(0, Q)
z_t = h(x_{t+1}) + v,  v ~ N(0, R)

Unlike the EKF, the UKF does not require Jacobians. Instead, it propagates deterministic sigma points through the nonlinear functions to estimate the posterior statistics.

Parameters:
transition_fn

State transition function f(state, action) -> next_state.

observation_fn

Observation function h(state) -> observation.

Q

Process noise covariance of shape (d, d).

R

Observation noise covariance of shape (p, p).

alpha

Spread of sigma points around the mean.

beta

Prior knowledge about the distribution (2.0 is optimal for Gaussian).

kappa

Secondary scaling parameter.

Example

>>> import numpy as np
>>> f = lambda x, u: x
>>> h = lambda x: x
>>> Q = 0.1 * np.eye(2)
>>> R = 0.5 * np.eye(2)
>>> updater = UnscentedKalmanFilterUpdater(
...     transition_fn=f, observation_fn=h, Q=Q, R=R,
... )
>>> mean = np.zeros(2)
>>> cov = np.eye(2)
>>> new_mean, new_cov = updater.update(mean, cov, np.zeros(1), np.array([1.0, 0.0]))
>>> new_mean.shape
(2,)
property config_id: str

Return a deterministic identifier for this updater configuration.

update(mean, covariance, action, observation)[source]

Perform a single predict-correct belief update.

Parameters:
  • mean – Prior mean vector of shape (d,).

  • covariance – Prior covariance matrix of shape (d, d).

  • action – Action that was executed.

  • observation – Observation that was received.

Returns:

A tuple (new_mean, new_covariance) representing the posterior Gaussian.

POMDPPlanners.core.belief.gaussian_mixture_belief module

Gaussian Mixture belief state representation for POMDP environments.

This module provides a Gaussian Mixture Model (GMM) belief state that represents the posterior as a weighted mixture of multivariate Gaussians. Updates are delegated to a GaussianMixtureBeliefUpdater instance, following the same dependency injection pattern as GaussianBelief.

Classes:

GaussianMixtureBeliefUpdater: ABC for GMM belief update strategies.
GaussianMixtureBelief: GMM belief with pluggable updater.

class POMDPPlanners.core.belief.gaussian_mixture_belief.GaussianMixtureBelief(means, covariances, weights, updater, n_terminal_check_samples=50)[source]

Bases: Belief

Gaussian Mixture Model belief state representation.

Represents the belief as a weighted mixture of multivariate normal distributions: p(x) = sum_k w_k * N(x; mu_k, Sigma_k). The update mechanism delegates to a GaussianMixtureBeliefUpdater instance, allowing flexibility in how mixture components are updated, pruned, or merged.

This belief type is compatible with PFT_DPW, Sparse-PFT, and SparseSampling planners. It is NOT compatible with POMCP/POMCP_DPW planners because it does not support incremental particle accumulation via inplace_update().

Parameters:
means

List of mean vectors, one per component.

covariances

List of covariance matrices, one per component.

weights

Array of mixture weights summing to 1.

updater

GaussianMixtureBeliefUpdater that computes the Bayesian belief update.

n_terminal_check_samples

Number of Monte Carlo samples for terminal checks.

Example

>>> import numpy as np
>>> np.random.seed(42)
>>>
>>> # Define a simple updater that shrinks covariances
>>> from POMDPPlanners.core.belief.gaussian_mixture_belief import (
...     GaussianMixtureBeliefUpdater,
... )
>>> class ShrinkUpdater(GaussianMixtureBeliefUpdater):
...     def update(self, means, covs, weights, action, obs):
...         return means, [c * 0.9 for c in covs], weights
...     @property
...     def config_id(self):
...         return "shrink"
>>>
>>> # Create a 2-component GMM belief in 2D
>>> means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
>>> covs = [np.eye(2), np.eye(2)]
>>> weights = np.array([0.5, 0.5])
>>> belief = GaussianMixtureBelief(
...     means=means, covariances=covs, weights=weights, updater=ShrinkUpdater(),
... )
>>>
>>> # Sample a state
>>> state = belief.sample()
>>> len(state) == 2
True
>>>
>>> # Update belief
>>> new_belief = belief.update(
...     action=0, observation=np.array([1.0, 1.0]), pomdp=None
... )
>>> new_belief.n_components == 2
True
property config_id: str

Generate a deterministic identifier based on belief configuration.

property dim: int

Return the dimensionality of the belief state.

entropy(n_samples=1000)[source]

Estimate the differential entropy via Monte Carlo sampling.

There is no closed-form expression for the entropy of a Gaussian mixture, so this method uses the approximation:

H ~ -mean(log p(x_i)), x_i ~ p(x)

Parameters:

n_samples (int) – Number of Monte Carlo samples. Defaults to 1000.

Return type:

float

Returns:

Estimated differential entropy in nats.
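A self-contained sketch of this estimator (illustrative; the library implementation may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_entropy(means, covs, weights, n_samples=5000):
    """Monte Carlo estimate H ~ -mean(log p(x_i)), x_i ~ p, for a GMM."""
    d = means[0].shape[0]
    # Draw samples from the mixture: pick a component, then sample from it
    idx = rng.choice(len(weights), size=n_samples, p=weights)
    xs = np.array([rng.multivariate_normal(means[i], covs[i]) for i in idx])
    # Evaluate the mixture density at every sample
    dens = np.zeros(n_samples)
    for w, mu, cov in zip(weights, means, covs):
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        diff = xs - mu
        quad = np.einsum("ni,ij,nj->n", diff, inv, diff)
        dens += w * np.exp(-0.5 * (quad + logdet + d * np.log(2 * np.pi)))
    return -np.mean(np.log(dens))

# Sanity check: a single standard Gaussian in 2D has closed-form
# entropy ln(2*pi*e), roughly 2.84 nats.
h = mc_entropy([np.zeros(2)], [np.eye(2)], np.array([1.0]))
```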

property n_components: int

Return the number of mixture components.

sample()[source]

Sample a state from the Gaussian mixture belief.

Selects a component according to the mixture weights, then draws a sample from that component’s Gaussian distribution.

Return type:

ndarray

Returns:

A state vector of shape (d,).

update(action, observation, pomdp=None, state=None)[source]

Update belief using the provided updater.

Parameters:
  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

  • pomdp (Optional[Environment]) – Unused. Kept for interface compatibility with Belief.

  • state (Optional[Any]) – Ignored for Gaussian mixture beliefs.

Return type:

GaussianMixtureBelief

Returns:

New GaussianMixtureBelief with updated components and weights.

class POMDPPlanners.core.belief.gaussian_mixture_belief.GaussianMixtureBeliefUpdater[source]

Bases: ABC

Abstract base class for Gaussian mixture belief updaters.

Subclasses implement an update cycle that maps (means, covariances, weights, action, observation) to an updated (new_means, new_covariances, new_weights) tuple.

Note

This is an abstract base class and cannot be instantiated directly.

abstract property config_id: str

Return a deterministic identifier for this updater configuration.

abstractmethod update(means, covariances, weights, action, observation)[source]

Perform a belief update for the Gaussian mixture.

Parameters:
  • means (List[ndarray]) – List of k mean vectors, each of shape (d,).

  • covariances (List[ndarray]) – List of k covariance matrices, each of shape (d, d).

  • weights (ndarray) – Mixture weights of shape (k,).

  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

Return type:

Tuple[List[ndarray], List[ndarray], ndarray]

Returns:

A tuple (new_means, new_covariances, new_weights).

POMDPPlanners.core.belief.particle_beliefs module

Particle-based belief state representations for POMDP environments.

This module provides particle filter implementations for approximate belief tracking, including both weighted and unweighted variants, with support for reinvigoration and incremental state accumulation.

Classes:

UnweightedParticleBelief: Uniform particle filter for discrete observation spaces
WeightedParticleBelief: Weighted particle filter for continuous observation spaces
WeightedParticleBeliefReinvigoration: Extended weighted filter with reinvigoration
WeightedParticleBeliefStateUpdate: Incremental weighted particle belief for online learning
UnweightedParticleBeliefStateUpdate: Incremental unweighted particle belief

Functions:

get_unique_support: Extract unique particles and their combined probabilities

class POMDPPlanners.core.belief.particle_beliefs.UnweightedParticleBelief(particles, reinvigoration_fraction=0.2)[source]

Bases: Belief

Unweighted particle belief implementation.

This class implements a particle filter with uniform particles.

Parameters:

particles (list)

reinvigorate(action, observation, pomdp)[source]

Simulate a new particle that matches the action-observation pair.

sample()[source]

Sample a particle from the belief.

update(action, observation, pomdp, state=None)[source]

Update belief with action-observation pair.

Return type:

UnweightedParticleBelief

class POMDPPlanners.core.belief.particle_beliefs.UnweightedParticleBeliefStateUpdate(particles=None)[source]

Bases: Belief

Uniform particle belief for incremental state accumulation.

This class implements a lightweight belief representation that maintains a collection of state particles with uniform probability distribution. Unlike weighted particle filters, all particles contribute equally to the belief state, making it suitable for discrete observation spaces where observation likelihoods are binary (match/no-match) rather than continuous probability distributions.

UnweightedParticleBeliefStateUpdate is designed for online learning and planning algorithms that incrementally accumulate particles during tree expansion or sequential state estimation. It provides both in-place and immutable update operations for different algorithmic requirements.

Key Features:

- Uniform Weighting: All particles have equal probability weight
- Incremental Accumulation: Add particles one-by-one without resampling
- Memory Efficient: No weight storage, minimal memory overhead
- Fast Sampling: Simple uniform random sampling from particle set
- Efficient Updates: Both in-place and immutable update operations
- Deterministic Config ID: Order-invariant identification for caching

Mathematical Foundation: The belief represents a discrete uniform distribution over accumulated particles. Each particle has equal probability 1/N where N is the total number of particles. For particles with the same state value, the probability is proportional to their count:

P(s) = count(s) / N = |{i: s_i = s}| / |particles|

This makes it ideal for discrete observation models where observations either match a state (probability 1) or don’t match (probability 0).
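A toy illustration of the uniform-count distribution described above (self-contained, not the library implementation):

```python
from collections import Counter

# P(s) = count(s) / N: duplicate particles raise a state's probability.
particles = ["tiger_left", "tiger_right", "tiger_left"]
counts = Counter(particles)
probs = {s: c / len(particles) for s, c in counts.items()}
# probs["tiger_left"] == 2/3, probs["tiger_right"] == 1/3
```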

Parameters:

particles (list | None)

particles

List of state particles, each with uniform probability

weights_sum

Total number of particles (equivalent to uniform weight sum)

Example

>>> from POMDPPlanners.environments.tiger_pomdp import TigerPOMDP
>>> env = TigerPOMDP(discount_factor=0.95)
>>> belief = UnweightedParticleBeliefStateUpdate(particles=[])
>>> belief.inplace_update("listen", "hear_left", env, "tiger_left")
>>> belief.inplace_update("listen", "hear_right", env, "tiger_right")
>>> sampled_state = belief.sample()
>>> sampled_state in ["tiger_left", "tiger_right"]
True
property config_id: str

Generate a deterministic identifier based on belief configuration.

This implementation ensures that config_id is invariant to the order of particles by sorting them.

inplace_update(action, observation, pomdp, state=None)[source]

Add a state particle with uniform weight to current belief.

This method modifies the current belief in-place by appending a new particle. Unlike weighted beliefs, no observation likelihood computation is performed; the new particle simply joins the uniform distribution.

Parameters:
  • action (Any) – Action that was executed to reach the state (not used for weighting)

  • observation (Any) – Observation received after executing the action (not used for weighting)

  • pomdp (Environment) – Environment instance (not used for uniform weighting)

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

None

sample()[source]

Sample a state uniformly from the current belief distribution.

Return type:

Any

Returns:

A state sampled uniformly from the particle set

Raises:

IndexError – If belief is empty (no particles to sample from)

update(action, observation, pomdp, state=None)[source]

Create new belief by adding a state particle with uniform weight.

This method creates a new belief instance without modifying the current one. Unlike weighted beliefs, all particles (including the new one) have equal probability in the resulting belief distribution.

Parameters:
  • action (Any) – Action that was executed to reach the state (not used for weighting)

  • observation (Any) – Observation received after executing the action (not used for weighting)

  • pomdp (Environment) – Environment instance (not used for uniform weighting)

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

UnweightedParticleBeliefStateUpdate

Returns:

New UnweightedParticleBeliefStateUpdate instance with the additional particle.

class POMDPPlanners.core.belief.particle_beliefs.WeightedParticleBelief(particles, log_weights, resampling=False, ess_factor=0.5)[source]

Bases: Belief

Weighted particle filter implementation for POMDP belief states.

This class implements a particle filter with weighted particles, suitable for continuous observation spaces. It supports automatic resampling based on effective sample size to maintain particle diversity.

Parameters:
particles

List of state particles representing the belief

log_weights

Log-weights of particles in log space for numerical stability

normalized_weights

Normalized probability weights (computed automatically)

resampling

Whether automatic resampling is enabled

ess_factor

Effective sample size factor for resampling threshold

ess_threshold

Computed threshold for triggering resampling

eps

Small epsilon value for numerical stability in weight updates

Example

>>> import numpy as np
>>> # Create belief with 10 particles (smaller for testing)
>>> particles = [[x, y] for x, y in zip(np.random.randn(10), np.random.randn(10))]
>>> log_weights = np.log(np.ones(10) / 10)  # Uniform weights
>>> belief = WeightedParticleBelief(
...     particles=particles,
...     log_weights=log_weights,
...     resampling=True,
...     ess_factor=0.5
... )
>>> # Sample a state from belief
>>> state = belief.sample()
>>> len(state) == 2  # [x, y] coordinate
True
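The resampling trigger implied by ess_factor can be sketched with the standard effective-sample-size formula (an illustration; the library's exact trigger logic may differ):

```python
import numpy as np

def effective_sample_size(normalized_weights):
    """ESS = 1 / sum(w_i^2): N for uniform weights, near 1 when degenerate."""
    return 1.0 / np.sum(normalized_weights ** 2)

w_uniform = np.full(10, 0.1)
w_skewed = np.array([0.91] + [0.01] * 9)  # one particle dominates

ess_factor = 0.5
# Resample when the ESS drops below ess_factor * N
needs_resampling = effective_sample_size(w_skewed) < ess_factor * len(w_skewed)
```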
property config_id: str

Generate a deterministic identifier based on belief configuration.

sample()[source]

Sample a particle from the belief.

to_dict()[source]

Convert the belief to a dictionary for serialization.

Returns:

A dictionary containing all necessary fields for deserialization.

Return type:

dict

to_unique_support_distribution()[source]

Convert the belief to a DiscreteDistribution with unique particles.

Returns:

A distribution where each particle appears only once, with its probability being the sum of all its occurrences in the original belief.

Return type:

DiscreteDistribution

update(action, observation, pomdp, state=None)[source]

Update belief with action-observation pair.

Return type:

WeightedParticleBelief

class POMDPPlanners.core.belief.particle_beliefs.WeightedParticleBeliefReinvigoration(particles, log_weights, resampling=True, ess_factor=0.5, reinvigoration_fraction=0.2)[source]

Bases: WeightedParticleBelief, ABC

Weighted particle belief with reinvigoration capability.

abstractmethod reinvigorate(action, observation, pomdp, belief)[source]

Implement reinvigoration for specific POMDP environment.

Return type:

Belief

update(action, observation, pomdp, state=None)[source]

Update belief with reinvigoration.

Return type:

WeightedParticleBelief

class POMDPPlanners.core.belief.particle_beliefs.WeightedParticleBeliefStateUpdate(particles=None, weights=None)[source]

Bases: Belief

Incremental weighted particle belief for online state estimation.

This class implements a lightweight belief representation that incrementally accumulates state particles with associated observation likelihood weights. It is designed for online learning and planning algorithms that build beliefs by sequentially adding individual state samples rather than maintaining a fixed-size particle set.

Unlike traditional particle filters that maintain a fixed number of particles through resampling, WeightedParticleBeliefStateUpdate grows dynamically by accumulating particles with observation-based weights. This makes it particularly suitable for Monte Carlo Tree Search (MCTS) algorithms where beliefs are built incrementally during tree expansion.

Key Features:

- Incremental Accumulation: Add particles one-by-one without resampling
- Observation Weighting: Each particle weighted by observation likelihood
- Efficient Updates: Both in-place and immutable update operations
- Weighted Sampling: Sample states proportionally to observation evidence
- Memory Efficient: No fixed particle budget, grows as needed
- Deterministic Config ID: Order-invariant identification for caching

Mathematical Foundation: The belief represents a discrete probability distribution where each particle s_i has weight w_i = P(o|s_i,a), the observation likelihood. The probability of state s is proportional to the sum of weights for all particles with that state:

P(s|o,a) ∝ Σ_{i: s_i=s} w_i
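A self-contained sketch of this weight-pooling rule (illustrative, not the library code):

```python
import numpy as np

# Duplicate states pool their observation-likelihood weights.
particles = ["s1", "s2", "s1"]
weights = np.array([0.4, 0.2, 0.4])  # e.g. each w_i = P(o | s_i, a)

def state_probability(state):
    """P(s | o, a) proportional to the summed weights of matching particles."""
    mask = [p == state for p in particles]
    return weights[mask].sum() / weights.sum()
# state_probability("s1") == 0.8, state_probability("s2") == 0.2
```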

Parameters:
  • particles (list | None)

  • weights (list | None)

particles

List of state particles representing possible world states

weights

List of observation likelihood weights for each particle

weights_sum

Running sum of all weights for efficient normalization

Example

>>> from POMDPPlanners.environments.tiger_pomdp import TigerPOMDP
>>> env = TigerPOMDP(discount_factor=0.95)
>>> belief = WeightedParticleBeliefStateUpdate(particles=[], weights=[])
>>> belief.inplace_update("listen", "hear_left", env, "tiger_left")
>>> belief.inplace_update("listen", "hear_left", env, "tiger_right")
>>> sampled_state = belief.sample()
>>> sampled_state in ["tiger_left", "tiger_right"]
True
property config_id: str

Generate a deterministic identifier based on belief configuration.

This implementation ensures that config_id is invariant to the order of particles and weights, similar to WeightedParticleBelief.

inplace_update(action, observation, pomdp, state=None)[source]

Add a state particle with observation weight to current belief.

This method modifies the current belief in-place by appending a new particle and its corresponding observation likelihood weight. The weight is computed using the environment’s observation model and efficiently updates the running weight sum.

Parameters:
  • action (Any) – Action that was executed to reach the state

  • observation (Any) – Observation received after executing the action

  • pomdp (Environment) – Environment providing the observation model for weight computation

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

None

sample()[source]

Sample a state from the current belief distribution.

Return type:

Any

Returns:

A state sampled according to the belief’s probability distribution

Raises:

ValueError – If belief is empty or has zero weights.

to_unique_support_distribution()[source]

Convert the belief to a DiscreteDistribution with unique particles.

Returns:

A distribution where each particle appears only once, with its probability being the sum of all its occurrences in the original belief.

Return type:

DiscreteDistribution

update(action, observation, pomdp, state=None)[source]

Create new belief by adding a state particle with observation weight.

This method creates a new belief instance without modifying the current one. The new particle’s weight is computed as the observation likelihood given the state and action using the environment’s observation model.

Parameters:
  • action (Any) – Action that was executed to reach the state

  • observation (Any) – Observation received after executing the action

  • pomdp (Environment) – Environment providing the observation model for weight computation

  • state (Optional[Any]) – State particle to add to the belief. If None, no particle is added.

Return type:

WeightedParticleBeliefStateUpdate

Returns:

New WeightedParticleBeliefStateUpdate instance with the additional particle and updated weights.

POMDPPlanners.core.belief.particle_beliefs.get_unique_support(particles, probabilities)[source]

Extract unique particles and their combined probabilities.

This function takes a list of particles and their associated probabilities, combines probabilities for duplicate particles, and returns unique particles with normalized probabilities.

Parameters:
  • particles (List[Any]) – List of particles of any type

  • probabilities (ndarray) – Array of probabilities/weights corresponding to each particle

Returns:

  • List of unique particles (preserving original types)

  • Normalized numpy array of probabilities summing to 1

Return type:

Tuple[List[Any], ndarray]

Example

>>> import numpy as np
>>> particles = [1, 2, 1, 3, 2]
>>> probs = np.array([0.2, 0.3, 0.1, 0.2, 0.2])
>>> unique_particles, unique_probs = get_unique_support(particles, probs)
>>> unique_particles  # [1, 2, 3]
[1, 2, 3]
>>> float(np.sum(unique_probs))  # Should be 1.0
1.0
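The combining step shown in the example could be implemented along these lines (a sketch, not the library's actual code; it assumes particles are hashable):

```python
import numpy as np

def get_unique_support_sketch(particles, probabilities):
    # Merge probabilities of duplicate particles, preserving first-seen order,
    # then normalize the result so it sums to 1.
    index = {}
    unique, combined = [], []
    for p, w in zip(particles, probabilities):
        if p in index:
            combined[index[p]] += float(w)
        else:
            index[p] = len(unique)
            unique.append(p)
            combined.append(float(w))
    probs = np.asarray(combined)
    return unique, probs / probs.sum()
```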

POMDPPlanners.core.belief.vectorized_particle_belief_updater module

Abstract base class for vectorized particle belief updaters.

This module provides the VectorizedParticleBeliefUpdater ABC that defines a batched interface for particle belief updates. Concrete implementations perform all-particle transitions and observation log-likelihood evaluations using vectorized NumPy operations, eliminating Python-level loops over individual particles.

Classes:

VectorizedParticleBeliefUpdater: ABC for batched particle belief updates.

class POMDPPlanners.core.belief.vectorized_particle_belief_updater.VectorizedParticleBeliefUpdater[source]

Bases: ABC

Abstract base class for vectorized particle belief updaters.

Subclasses implement batched transition and observation log-likelihood methods that operate on the full particle array at once, enabling NumPy-level vectorization instead of Python loops.

Note

This is an abstract base class and cannot be instantiated directly.

abstractmethod batch_observation_log_likelihood(next_particles, action, observation)[source]

Compute observation log-likelihoods for all particles at once.

Parameters:
  • next_particles (ndarray) – Transitioned particle states of shape (N, d).

  • action (ndarray) – Action vector.

  • observation (ndarray) – Observed value.

Return type:

ndarray

Returns:

Log-likelihoods of shape (N,).
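As an illustration of the kind of implementation this interface expects, a hypothetical Gaussian observation model (not part of the library) can score all particles in one broadcasted expression:

```python
import numpy as np

def gaussian_batch_log_likelihood(next_particles, observation, sigma=0.5):
    # log N(observation | particle, sigma^2 I) for every particle at once;
    # (N, d) minus (d,) broadcasts to (N, d), reduced along axis 1 to shape (N,).
    d = next_particles.shape[1]
    sq_dist = np.sum((next_particles - observation) ** 2, axis=1)
    return -0.5 * sq_dist / sigma**2 - 0.5 * d * np.log(2 * np.pi * sigma**2)
```

Particles closer to the observation receive higher log-likelihoods, and no Python-level loop over the N particles is needed.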

abstractmethod batch_transition(particles, action)[source]

Transition all particles in a single batched operation.

Parameters:
  • particles (ndarray) – Current particle states of shape (N, d).

  • action (ndarray) – Action vector.

Return type:

ndarray

Returns:

Next-state particles of shape (N, d).

abstract property config_id: str

Return a deterministic identifier for this updater configuration.

POMDPPlanners.core.belief.vectorized_weighted_particle_belief module

Vectorized weighted particle belief state representation.

This module provides a weighted particle filter that stores particles as a NumPy array and delegates updates to a VectorizedParticleBeliefUpdater, eliminating Python-level loops over individual particles.

Classes:

VectorizedWeightedParticleBelief: Vectorized weighted particle filter.

class POMDPPlanners.core.belief.vectorized_weighted_particle_belief.VectorizedWeightedParticleBelief(particles, log_weights, updater, resampling=False, ess_factor=0.5)[source]

Bases: Belief

Vectorized weighted particle filter for POMDP belief states.

Stores particles as a 2-D NumPy array of shape (N, d) and performs all update operations via a VectorizedParticleBeliefUpdater, so the entire predict-reweight-resample cycle runs without Python loops over particles.

particles

Particle array of shape (N, d).

log_weights

Log-weights of shape (N,).

normalized_weights

Probability weights of shape (N,).

updater

Vectorized updater instance.

resampling

Whether automatic ESS-based resampling is enabled.

ess_factor

Fraction of N used as the ESS threshold.
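The ESS criterion usually used for this kind of trigger is ESS = 1 / Σᵢ wᵢ² over the normalized weights, with resampling fired when ESS drops below ess_factor · N. Assuming that convention (the class's exact rule is not shown here), ESS can be computed stably from log-weights:

```python
import numpy as np

def effective_sample_size(log_weights):
    # Normalize in log-space for numerical stability, then compute
    # ESS = 1 / sum(w_i^2) over the normalized weights w_i.
    lw = log_weights - np.logaddexp.reduce(log_weights)
    w = np.exp(lw)
    return 1.0 / np.sum(w**2)

# Uniform weights over N = 20 particles give the maximum ESS of 20;
# a nearly degenerate weight vector drives ESS toward 1.
```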

Example

>>> import numpy as np
>>> np.random.seed(42)
>>>
>>> # Create a trivial updater (shifts particles by the action) for demonstration
>>> from POMDPPlanners.core.belief.vectorized_particle_belief_updater import (
...     VectorizedParticleBeliefUpdater,
... )
>>> class IdentityUpdater(VectorizedParticleBeliefUpdater):
...     def batch_transition(self, particles, action):
...         return particles + action
...     def batch_observation_log_likelihood(self, next_particles, action, observation):
...         return np.zeros(len(next_particles))
...     @property
...     def config_id(self):
...         return "identity"
>>>
>>> particles = np.random.randn(20, 2)
>>> log_w = np.log(np.ones(20) / 20)
>>> belief = VectorizedWeightedParticleBelief(
...     particles=particles,
...     log_weights=log_w,
...     updater=IdentityUpdater(),
...     resampling=True,
... )
>>> state = belief.sample()
>>> state.shape
(2,)
>>> new_belief = belief.update(
...     action=np.array([1.0, 0.0]),
...     observation=np.array([0.5, 0.5]),
...     pomdp=None,
... )
>>> new_belief.particles.shape
(20, 2)
property config_id: str

Generate a deterministic identifier based on belief configuration.

property dim: int

Return the state dimensionality.

property n_particles: int

Return the number of particles.

sample()[source]

Sample a state from the belief.

Return type:

ndarray

Returns:

A state vector of shape (d,).

update(action, observation, pomdp=None, state=None)[source]

Update belief using the vectorized updater.

Parameters:
  • action (Any) – Action that was executed.

  • observation (Any) – Observation that was received.

  • pomdp (Optional[Environment]) – Unused. Kept for interface compatibility.

  • state (Optional[Any]) – Ignored.

Return type:

VectorizedWeightedParticleBelief

Returns:

New VectorizedWeightedParticleBelief with updated particles and weights.