POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero package

Submodules

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_belief_representation module

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero module

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler module

Tests for BetaZeroActionSampler network-guided action sampling.

This module tests the BetaZeroActionSampler class including fallback behaviour, discrete and continuous network-guided sampling, and pickle serialisation.

class POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.SimpleFallbackSampler[source]

Bases: ActionSampler

sample(belief_node=None)[source]

Sample a new action for progressive widening.

Parameters:

belief_node – Optional belief node context for informed sampling

Returns:

A sampled action compatible with the environment’s action space
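A test double of this kind can be very small. The sketch below is an assumption about its shape, not the module's source: the ActionSampler base class here is a local stand-in for the library's class, and the constant return value mirrors the “fallback_action” that the fallback tests in this module compare against.

```python
from abc import ABC, abstractmethod


class ActionSampler(ABC):
    """Local stand-in for the library's ActionSampler (assumed interface)."""

    @abstractmethod
    def sample(self, belief_node=None):
        """Return an action, optionally using the belief node as context."""


class SimpleFallbackSampler(ActionSampler):
    """Deterministic sampler, so tests can tell when the fallback was used."""

    def sample(self, belief_node=None):
        # The belief node is ignored: the same sentinel action is always returned.
        return "fallback_action"


fallback = SimpleFallbackSampler()
action = fallback.sample()
```

Returning a fixed sentinel keeps the delegation tests deterministic: any other return value means a different code path produced the action.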

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.test_continuous_sampling_centered_on_predicted_mean()[source]

Test that continuous samples are in a reasonable range around the network mean.

Purpose: Validates that for a continuous action space, the sampler produces action vectors whose components are within a plausible range of the network’s predicted mean (not wildly divergent).

Given: A BetaZeroActionSampler with a continuous BetaZeroNetwork (action_dim=2), a ParticleMeanStdRepresentation, and no discrete actions list.

When: sample is called 50 times with a valid belief node.

Then: All samples are finite numpy arrays of the correct shape (action_dim,), and the empirical mean of samples is within 5 standard deviations of the network’s predicted mean.

Test type: unit
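The checks described above can be sketched with NumPy alone. This is an illustration under assumptions, not the test's code: predicted_mean and noise_scale are hypothetical values standing in for the network output and the sampler's noise, samples are drawn as Gaussian perturbations of the mean, and the "standard deviation" in the tolerance is taken to be noise_scale.

```python
import numpy as np

rng = np.random.default_rng(0)

action_dim = 2
predicted_mean = np.array([0.3, -0.7])  # hypothetical network output
noise_scale = 0.5                       # hypothetical sampling noise (std)

# Draw 50 actions as Gaussian perturbations of the predicted mean.
samples = [
    predicted_mean + noise_scale * rng.standard_normal(action_dim)
    for _ in range(50)
]

# Every sample should be a finite array of shape (action_dim,).
all_valid = all(
    isinstance(s, np.ndarray)
    and s.shape == (action_dim,)
    and bool(np.all(np.isfinite(s)))
    for s in samples
)

# The empirical mean should lie within 5 standard deviations of the
# predicted mean in every component.
empirical_mean = np.mean(samples, axis=0)
within_tolerance = bool(
    np.all(np.abs(empirical_mean - predicted_mean) <= 5 * noise_scale)
)
```

A tolerance of 5 standard deviations is deliberately loose: it catches a sampler that ignores the network without making the test flaky.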

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.test_discrete_sampling_follows_policy()[source]

Test that discrete sampling produces a non-uniform distribution guided by the network.

Purpose: Validates that when a network and belief representation are attached, the sampler draws actions according to the network’s softmax policy rather than uniformly at random.

Given: A BetaZeroActionSampler with a discrete BetaZeroNetwork, a ParticleMeanStdRepresentation, and a list of three actions.

When: sample is called 100 times with a valid belief node.

Then: The distribution of sampled actions is non-uniform. Specifically, the most frequently sampled action is selected more often than 1/3 of the time (the uniform expectation), indicating the network policy influences sampling.

Test type: unit
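Policy-guided discrete sampling of the kind checked here can be sketched as a softmax over logits followed by a categorical draw. The action names and logits below are hypothetical, and the code is a stand-in for the idea, not the library's sampler.

```python
from collections import Counter

import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


rng = np.random.default_rng(0)

actions = ["left", "stay", "right"]   # hypothetical action list
logits = np.array([2.0, 0.0, -1.0])   # hypothetical network logits
policy = softmax(logits)

# Draw 100 action indices according to the policy and count the outcomes.
n_draws = 100
indices = rng.choice(len(actions), size=n_draws, p=policy)
counts = Counter(actions[i] for i in indices)

top_action, top_count = counts.most_common(1)[0]
top_fraction = top_count / n_draws
```

One design consideration: with 100 draws over three actions, the most frequent action necessarily has at least 34 counts, so a threshold of exactly 1/3 is met by any sampler. A stricter threshold (0.5 in the assertions that accompany this sketch, a hypothetical choice) or a check that the top action matches the policy's argmax is more discriminating.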

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.test_fallback_without_belief_node()[source]

Test that the fallback sampler is used when belief_node is None.

Purpose: Validates that BetaZeroActionSampler delegates to the fallback sampler when no belief node context is provided.

Given: A BetaZeroActionSampler with a SimpleFallbackSampler and no network/representation attached.

When: sample is called with belief_node=None.

Then: The returned action equals “fallback_action” from the fallback sampler.

Test type: unit

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.test_fallback_without_network()[source]

Test that the fallback sampler is used when the network is not set.

Purpose: Validates that BetaZeroActionSampler delegates to the fallback sampler when set_network_and_representation has not been called, even if a valid belief node is provided.

Given: A BetaZeroActionSampler without network/representation and a valid belief node.

When: sample is called with the belief node.

Then: The returned action equals “fallback_action” from the fallback sampler.

Test type: unit
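The two fallback conditions, a missing belief node and a missing network, can be sketched as a single guard at the top of sample. GuidedSampler below is a hypothetical stand-in written for illustration. Only the method name set_network_and_representation and the “fallback_action” value come from the descriptions above; the attribute names and the callables used as network and representation are assumptions.

```python
class FallbackSampler:
    """Deterministic fallback that always returns the same sentinel action."""

    def sample(self, belief_node=None):
        return "fallback_action"


class GuidedSampler:
    """Hypothetical stand-in, not the library's BetaZeroActionSampler."""

    def __init__(self, fallback):
        self._fallback = fallback
        self._network = None
        self._belief_representation = None

    def set_network_and_representation(self, network, representation):
        self._network = network
        self._belief_representation = representation

    def sample(self, belief_node=None):
        # Delegate whenever guided sampling is impossible: no belief context,
        # or no network/representation attached yet.
        if (
            belief_node is None
            or self._network is None
            or self._belief_representation is None
        ):
            return self._fallback.sample(belief_node)
        features = self._belief_representation(belief_node)
        return self._network(features)


sampler = GuidedSampler(FallbackSampler())
no_node = sampler.sample(belief_node=None)         # no belief node
no_network = sampler.sample(belief_node=object())  # node given, no network

# Attach trivial callables (assumed interfaces) to reach the guided path.
sampler.set_network_and_representation(
    network=lambda features: "network_action",
    representation=lambda node: node,
)
guided = sampler.sample(belief_node=object())
still_fallback = sampler.sample(belief_node=None)
```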

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.test_pickle_round_trip()[source]

Test full pickle.dumps → pickle.loads round trip for BetaZeroActionSampler.

Purpose: Validates that BetaZeroActionSampler can be successfully pickled and unpickled using the full pickle protocol, which is critical for joblib/multiprocessing compatibility.

Given: A BetaZeroActionSampler with network and representation attached.

When: The sampler is pickled with pickle.dumps and then unpickled with pickle.loads (simulating joblib/multiprocessing serialization).

Then: The unpickled sampler retains all non-network attributes (fallback, actions, noise_scale), has network and representation set to None, and correctly delegates to fallback when used.

Test type: unit

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.test_pickle_round_trip_continuous()[source]

Test full pickle round trip for BetaZeroActionSampler with continuous actions.

Purpose: Validates that BetaZeroActionSampler for continuous action spaces can be successfully pickled and unpickled.

Given: A BetaZeroActionSampler configured for continuous actions with network and representation attached.

When: The sampler is pickled and unpickled via pickle.dumps/loads.

Then: The unpickled sampler preserves all attributes and functions correctly.

Test type: unit

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_action_sampler.test_pickle_serialization()[source]

Test that BetaZeroActionSampler serialisation strips the network.

Purpose: Validates that __getstate__ removes the network and belief representation, that pickle.dumps succeeds, and that a manual reconstruction via __setstate__ restores a working sampler that falls back correctly.

Given: A BetaZeroActionSampler with network and representation attached.

When: __getstate__ is called (as pickle.dumps does internally), the state is inspected, and a fresh instance is reconstructed via __setstate__.

Then: The serialised state has _network and _belief_representation set to None, pickle.dumps succeeds, and the reconstructed sampler retains the fallback sampler, actions list, and noise_scale, and delegates to the fallback when sampled without re-attaching a network.

Test type: unit
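The network-stripping behaviour described above follows a common pickle pattern: __getstate__ returns a copy of the instance dictionary with the unpicklable members set to None. The sketch below illustrates that pattern with a hypothetical PicklableSampler. The attribute names _network, _belief_representation, and noise_scale follow the description; the constructor signature and the lambdas standing in for the network and representation are assumptions.

```python
import pickle


class FallbackSampler:
    """Deterministic fallback that always returns the same sentinel action."""

    def sample(self, belief_node=None):
        return "fallback_action"


class PicklableSampler:
    """Hypothetical stand-in, not the library's BetaZeroActionSampler."""

    def __init__(self, fallback, actions=None, noise_scale=0.1):
        self.fallback = fallback
        self.actions = actions
        self.noise_scale = noise_scale
        self._network = None
        self._belief_representation = None

    def __getstate__(self):
        # Copy first, so the live object keeps its network after pickling.
        state = self.__dict__.copy()
        state["_network"] = None
        state["_belief_representation"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)

    def sample(self, belief_node=None):
        if belief_node is None or self._network is None:
            return self.fallback.sample(belief_node)
        return self._network(self._belief_representation(belief_node))


sampler = PicklableSampler(FallbackSampler(), actions=[0, 1, 2], noise_scale=0.2)
# Lambdas cannot be pickled, which makes them a convenient stand-in for a
# network that must not travel with the sampler.
sampler._network = lambda features: 0
sampler._belief_representation = lambda node: node

state = sampler.__getstate__()
restored = pickle.loads(pickle.dumps(sampler))
restored_action = restored.sample(belief_node=object())
```

After a round trip, the caller is expected to re-attach the network before guided sampling resumes. Until then the restored sampler behaves like its fallback.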

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_beta_zero_network module

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_puct module

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_training module

POMDPPlanners.tests.test_planners.test_mcts_planners.test_beta_zero.test_training_buffer module