pomdp-belief-tracking
This package is intended to be used as a library by others. We identify two
core components: belief representations and their updates. A
StateDistribution is a distribution over
the state that can be sampled from, while a
BeliefUpdate takes a current belief,
an action, and an observation, and produces the next belief.
At its core, a belief is a distribution from which states can be sampled. Depending on the specific type, more functionality can be expected. However, for most purposes this definition will suffice:
-
StateDistribution.__call__()
Required implementation of a distribution: the ability to sample states
- Return type
- Returns
state sampled according to distribution
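To make this definition concrete, here is a minimal sketch of an object that satisfies it. The class name `ParticleDistribution`, the integer `State` alias, and the `seed` parameter are assumptions made for illustration and are not part of the library; the only requirement taken from the text above is that calling the object returns a sampled state.

```python
import random
from typing import Callable, List

# Hypothetical stand-in for the library's abstract state type.
State = int

# Per the definition above: a callable without arguments that returns a state.
StateDistribution = Callable[[], State]


class ParticleDistribution:
    """Illustrative distribution backed by equally weighted particles."""

    def __init__(self, particles: List[State], seed: int = 0) -> None:
        if not particles:
            raise ValueError("need at least one particle")
        self.particles = list(particles)
        self._rng = random.Random(seed)

    def __call__(self) -> State:
        # Sampling a state: pick one particle uniformly at random.
        return self._rng.choice(self.particles)


belief: StateDistribution = ParticleDistribution([0, 0, 1, 2])
samples = [belief() for _ in range(1000)]
```

Under this reading, a plain function or lambda that returns a state would qualify as a distribution just as well as a class.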
Similarly, the exact detail of the belief update will differ immensely, and some update functions are only applicable to specific beliefs. Here we adopt the following definition:
-
BeliefUpdate.__call__(p, a, o)
Updates the distribution p given an action and observation
- Parameters
p (StateDistribution) – current distribution
a (Action) – taken action
o (Observation) – perceived observation
- Return type
Tuple[StateDistribution, Dict[str, Any]]
- Returns
next distribution
Here Info, the Dict[str, Any] in the return type, is a dictionary that the
belief update can populate with information or context, for example for
reporting.
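As an illustration of this signature, the sketch below builds a belief update by rejection sampling: it simulates transitions from sampled states and keeps only those whose simulated observation matches the real one. Everything named here (`make_rejection_update`, `toy_sim`, the integer type aliases, the `"attempts"` key) is hypothetical and chosen for the example; it is not presented as the library's own implementation.

```python
import random
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-ins for the library's abstract types.
State = int
Action = int
Observation = int
Info = Dict[str, Any]
StateDistribution = Callable[[], State]
Sim = Callable[[State, Action], Tuple[State, Observation]]
Update = Callable[
    [StateDistribution, Action, Observation], Tuple[StateDistribution, Info]
]


def make_rejection_update(sim: Sim, n_particles: int, rng: random.Random) -> Update:
    """Builds an update `(p, a, o) -> (next distribution, info)`."""

    def update(
        p: StateDistribution, a: Action, o: Observation
    ) -> Tuple[StateDistribution, Info]:
        accepted: List[State] = []
        attempts = 0
        # Note: loops forever if `o` can never be generated from `p` and `a`.
        while len(accepted) < n_particles:
            attempts += 1
            next_state, simulated_o = sim(p(), a)
            # Keep only simulated transitions that match the real observation.
            if simulated_o == o:
                accepted.append(next_state)
        # `p` is left untouched; a new distribution is returned.
        return (lambda: rng.choice(accepted)), {"attempts": attempts}

    return update


rng = random.Random(0)


def toy_sim(s: State, a: Action) -> Tuple[State, Observation]:
    """Toy dynamics: the state never changes and is observed correctly 85% of the time."""
    o = s if rng.random() < 0.85 else 1 - s
    return s, o


def prior() -> State:
    # Uniform prior over the two states 0 and 1.
    return rng.choice([0, 1])


update = make_rejection_update(toy_sim, n_particles=500, rng=rng)
posterior, info = update(prior, 0, 1)
fraction_in_state_1 = sum(posterior() for _ in range(1000)) / 1000
```

With a uniform prior and an 85% accurate observation, the exact posterior probability of state 1 after observing 1 is 0.85, so `fraction_in_state_1` should land roughly there. The returned `info` dictionary carries the number of simulation attempts, which is the kind of run-time context the text describes.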
Design
A quick note on some design choices that have been made.
API
My preferred style of coding is functional, where state and mutability can be avoided as much as possible. Hence, most of the code here is written from that perspective, and the belief update functionality is provided through a functional interface.
However, the belief is a crucial part and must be represented by some data
structure. Additionally, not all belief updates work with all beliefs, so
it can be a lot to ask of users to pair and maintain them by themselves. As a
result, we provide an actual Belief that
binds the two together.
-
class pomdp_belief_tracking.types.Belief(initial_distribution, update_function)
A belief is the combination of an update function and a current distribution
-
sample()
Samples from its distribution
- Return type
- Returns
state sampled according to distribution
-
update(a, o)
Updates (in place) the state distribution given an action and observation
- Parameters
a (Action) – the executed action
o (Observation) – the perceived observation
- Return type
Dict[str, Any]
- Returns
Side effect: updates in place, returns run-time info
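The relation between the functional update and the stateful Belief can be sketched as follows. This is a minimal illustration that assumes the wrapper simply stores the current distribution and replaces it on each update; `SketchBelief`, `shift_update`, and the type aliases are hypothetical names, and the library's actual class may differ in its details.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical stand-ins for the library's abstract types.
State = int
Action = str
Observation = str
Info = Dict[str, Any]
StateDistribution = Callable[[], State]
BeliefUpdate = Callable[
    [StateDistribution, Action, Observation], Tuple[StateDistribution, Info]
]


class SketchBelief:
    """Illustrative wrapper: a current distribution plus the update that fits it."""

    def __init__(
        self, initial_distribution: StateDistribution, update_function: BeliefUpdate
    ) -> None:
        self.distribution = initial_distribution
        self.update_function = update_function

    def sample(self) -> State:
        return self.distribution()

    def update(self, a: Action, o: Observation) -> Info:
        # The functional update returns a new distribution; the wrapper swaps it in.
        self.distribution, info = self.update_function(self.distribution, a, o)
        return info


def shift_update(
    p: StateDistribution, a: Action, o: Observation
) -> Tuple[StateDistribution, Info]:
    """Toy update: shifts every sampled state by one, ignoring `a` and `o`."""
    return (lambda: p() + 1), {"shifted_by": 1}


belief = SketchBelief(lambda: 0, shift_update)
info_1 = belief.update("noop", "nothing")
info_2 = belief.update("noop", "nothing")
state = belief.sample()
```

In this arrangement the only mutable state is the wrapper's reference to the current distribution, while the update function itself stays free of side effects.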
Types
I am unreasonably terrified of dynamically typed languages and have gone to extremes to define as many types as possible. Most of these are for internal use, but you will come across some as a user of this library. Most of these types have no actual meaning. In particular, the following are domain specific and unimportant for implementation details:
- the abstract type representing actions
- the abstract type representing observations
- the abstract type representing states
They are merely used to allow type-checking and to catch trivial bugs.
A notable exception is the Simulator,
which is assumed to be a callable that samples transitions.
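For intuition, a simulator in this sense could look like the toy example below. The text only states that it is a callable that samples transitions; the concrete shape used here, `(state, action) -> (next state, observation)`, along with the names `make_chain_simulator` and `slip`, is an assumption made for this sketch.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the library's abstract types.
State = int
Action = int
Observation = int

# Assumed shape: (state, action) -> (next state, observation).
Simulator = Callable[[State, Action], Tuple[State, Observation]]


def make_chain_simulator(n_states: int, slip: float, rng: random.Random) -> Simulator:
    """Toy 1-D chain: the action (-1 or +1) moves the agent unless it slips."""

    def sim(s: State, a: Action) -> Tuple[State, Observation]:
        moved = s if rng.random() < slip else s + a
        # Clamp to the ends of the chain.
        next_state = min(max(moved, 0), n_states - 1)
        # The observation only reveals whether the agent is at the right end.
        return next_state, int(next_state == n_states - 1)

    return sim


sim = make_chain_simulator(n_states=5, slip=0.2, rng=random.Random(0))
state = 0
trajectory: List[Tuple[State, Observation]] = []
for _ in range(20):
    state, observation = sim(state, +1)
    trajectory.append((state, observation))
```

Because the simulator is just a callable, a belief update can use it without knowing anything else about the domain.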
-
Belief.__call__(**kwargs) Call self as a function.