pomdp-belief-tracking

This package is intended to be used as a library by others. We identify two core components: belief representations and their updates. A StateDistribution is a distribution over states that can be sampled from, while a BeliefUpdate takes a current belief, an action and an observation and produces the next belief.

At its core, a belief is a distribution from which states can be sampled. Depending on the specific type, more functionality can be expected. However, for most purposes this definition will suffice:

StateDistribution.__call__()

Required implementation of distribution: the ability to sample states

Return type

State

Returns

state sampled according to distribution
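To make the definition above concrete, here is a minimal sketch of something that would satisfy it: a callable that returns a state each time it is called. The name ParticleDistribution, the State alias and the optional rng argument are assumptions made for this illustration and are not part of the library.

```python
import random
from typing import Any, List, Optional

# Stand-in for pomdp_belief_tracking.types.State (assumption for this sketch).
State = Any


class ParticleDistribution:
    """Hypothetical distribution backed by a list of equally weighted states."""

    def __init__(self, particles: List[State], rng: Optional[random.Random] = None):
        if not particles:
            raise ValueError("need at least one particle to sample from")
        self.particles = list(particles)
        self.rng = rng if rng is not None else random.Random()

    def __call__(self) -> State:
        # The only required functionality: sample a state.
        return self.rng.choice(self.particles)


if __name__ == "__main__":
    distribution = ParticleDistribution(["left", "right"], random.Random(0))
    print([distribution() for _ in range(5)])
```

Anything callable without arguments that returns a state fits this shape, so a plain function or lambda would do just as well.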

Similarly, the exact details of a belief update differ widely between methods, and some update functions are only applicable to specific beliefs. Here we adopt the following definition:

BeliefUpdate.__call__(p, a, o)

Updates the distribution p given an action and observation

Parameters

p (StateDistribution): current distribution
a (Action): action
o (Observation): observation

Return type

Tuple[StateDistribution, Dict[str, Any]]

Returns

next distribution, together with an info dictionary

Here Info refers to the Dict[str, Any] in the return type: a dictionary that the belief update can populate with information or context, for example for reporting.
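As an illustration of this signature, here is a minimal sketch of a rejection-sampling update written in the functional style. It is not the library's implementation. The factory name make_rejection_sampling_update, the simulator signature (state, action) -> (next state, observation), the max_attempts guard and the keys in the info dictionary are all assumptions made for this example.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

# Stand-ins for the library's abstract types (assumptions for this sketch).
State = Any
Action = Any
Observation = Any
StateDistribution = Callable[[], State]
# Assumed simulator shape: (state, action) -> (next state, observation).
Simulator = Callable[[State, Action], Tuple[State, Observation]]
Info = Dict[str, Any]


def make_rejection_sampling_update(
    sim: Simulator,
    n_particles: int,
    rng: random.Random,
    max_attempts: int = 100_000,
) -> Callable[[StateDistribution, Action, Observation], Tuple[StateDistribution, Info]]:
    """Builds an update function with the (p, a, o) -> (p', info) shape."""

    def update(p: StateDistribution, a: Action, o: Observation):
        accepted: List[State] = []
        attempts = 0
        while len(accepted) < n_particles:
            if attempts >= max_attempts:
                raise RuntimeError("observation was not reproduced often enough")
            attempts += 1
            # Sample a state from the current belief and simulate one step.
            next_state, simulated_o = sim(p(), a)
            # Keep the next state only if it explains the real observation.
            if simulated_o == o:
                accepted.append(next_state)

        def next_distribution() -> State:
            return rng.choice(accepted)

        # The input distribution p is left untouched; a new one is returned.
        return next_distribution, {"attempts": attempts}

    return update


if __name__ == "__main__":
    rng = random.Random(0)
    prior = lambda: rng.choice([0, 1])
    parity_sim = lambda s, a: (s + a, (s + a) % 2)  # toy simulator
    update = make_rejection_sampling_update(parity_sim, 10, rng)
    posterior, info = update(prior, 1, 0)
    print(posterior(), info)
```

Note that the update never mutates p, which is what makes it easy to compose and test in isolation.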

Design

A quick note on some design choices that have been made.

API

My preferred style of coding is functional, where state and mutability can be avoided as much as possible. Hence, most of the code here is written from that perspective, and the belief update functionality is provided through a functional interface.

However, the belief is a crucial part and must be represented by some data structure. Additionally, not all belief updates work with all beliefs, so it can be a lot to ask of users to pair, update and maintain them by themselves. As a result, we provide an actual Belief that binds the two together.

class pomdp_belief_tracking.types.Belief(initial_distribution, update_function)

A belief is the combination of an update function and a current distribution

sample()

Samples from its distribution

Return type

State

Returns

state sampled according to distribution

update(a, o)

Updates (in place) the state distribution given an action and observation

Parameters

a (Action): action
o (Observation): observation

Return type

Dict[str, Any]

Returns

Side effect: updates in place, returns run-time info
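The following sketch shows one way such a wrapper could bind a distribution to a functional update, based only on the interface documented above. The class name SketchBelief and the toy shift_update function are hypothetical and exist only for this example; the actual Belief class may be implemented differently.

```python
from typing import Any, Callable, Dict, Tuple

# Stand-ins for the library's abstract types (assumptions for this sketch).
State = Any
Action = Any
Observation = Any
StateDistribution = Callable[[], State]
Info = Dict[str, Any]
BeliefUpdate = Callable[
    [StateDistribution, Action, Observation], Tuple[StateDistribution, Info]
]


class SketchBelief:
    """Hypothetical wrapper pairing a distribution with its update function."""

    def __init__(self, initial_distribution: StateDistribution, update_function: BeliefUpdate):
        self.distribution = initial_distribution
        self.update_function = update_function

    def sample(self) -> State:
        return self.distribution()

    def update(self, a: Action, o: Observation) -> Info:
        # The functional update returns a new distribution; the wrapper is the
        # only place where state is replaced in place.
        self.distribution, info = self.update_function(self.distribution, a, o)
        return info


def shift_update(p: StateDistribution, a: Action, o: Observation):
    """Toy update for illustration: shifts every sampled state by the action."""
    return (lambda: p() + a), {"observation": o}


if __name__ == "__main__":
    belief = SketchBelief(lambda: 0, shift_update)
    print(belief.update(2, "x"))
    print(belief.sample())
```

With this split, the mutable part stays confined to a few lines, while the update functions themselves remain pure.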

Types

I am unreasonably terrified of dynamically typed languages and have gone to extremes to define as many types as possible. Most of these are for internal use, but you will come across some as a user of this library. Most of these types have no actual meaning; in particular:

pomdp_belief_tracking.types.Action

The abstract type representing actions

pomdp_belief_tracking.types.Observation

The abstract type representing observations

pomdp_belief_tracking.types.State

The abstract type representing states

These are domain specific and unimportant for the implementation. They are merely used to allow type-checking and to catch trivial bugs.

A notable exception is the Simulator, which is assumed to be a callable that samples transitions.

Belief.__call__(**kwargs)

Call self as a function.
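To give an idea of what a callable that samples transitions might look like, here is a small sketch for a toy corridor domain. The signature (state, action) -> (next state, observation) is an assumption for this example, as are the names make_corridor_simulator and slip; check the library's own Simulator type for the exact arguments and return values it expects.

```python
import random
from typing import Callable, Tuple


def make_corridor_simulator(
    length: int, rng: random.Random, slip: float = 0.1
) -> Callable[[int, int], Tuple[int, bool]]:
    """Hypothetical simulator for an agent walking along a 1-D corridor.

    States are cell indices in [0, length), actions are -1 or +1, and the
    observation tells whether the agent stands in the last cell.
    """

    def simulate(state: int, action: int) -> Tuple[int, bool]:
        # With probability `slip` the agent stays where it is.
        step = action if rng.random() >= slip else 0
        # Walls at both ends: clamp the position to the corridor.
        next_state = min(max(state + step, 0), length - 1)
        observation = next_state == length - 1
        return next_state, observation

    return simulate


if __name__ == "__main__":
    sim = make_corridor_simulator(length=5, rng=random.Random(0))
    state = 0
    for _ in range(6):
        state, at_end = sim(state, +1)
        print(state, at_end)
```

Because the simulator is just a callable, a belief update can treat it as a black box and only ever ask it for sampled transitions.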