Intention-aware motion planning
Motivation: motion planning under uncertainty about the human's intention.
Assumptions
- The agent's intention belongs to a known finite set but is unknown to the robot.
- Given its intention, the agent's dynamics are modeled and known to the robot.
- The agent has perfect information about its own state and the robot's state.
Idea:
- Model intention-aware motion planning as a Mixed Observability Markov Decision Process (MOMDP), a structured variant of the POMDP.
- The agent's intention is the primary uncertain state variable in the MOMDP; the robot's and agent's poses are fully observed.
Preliminaries
- An MDP models uncertainty in action outcomes only; the state is fully observable.
- A POMDP additionally specifies an observation model $p(o \mid s', a)$, which captures observation uncertainty.
- In a POMDP the state is not directly observed; it is summarized by a belief $b(s)$, a probability distribution over states.
- A POMDP policy induces a value function mapping a belief $b$ to the expected total reward.
- Each alpha-vector defines a hyperplane over the belief space $B$, and the value function is the upper envelope of a finite set of such hyperplanes: $V(b) = \max_{\alpha \in \Gamma} \alpha \cdot b$. A small sketch of evaluating such a value function follows.
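As a concrete illustration, here is a minimal NumPy sketch of evaluating an alpha-vector value function and reading off the greedy action; the vectors, action names, and belief are made-up examples, not values from the paper:

```python
import numpy as np

# Each alpha-vector is a hyperplane over the belief simplex, tagged with the
# action to execute if it attains the maximum (all values are illustrative).
alphas = np.array([
    [10.0, 0.0],   # value of "go_straight" when theta = 0 vs. theta = 1
    [2.0, 8.0],    # value of "slow_down"
    [4.0, 4.0],    # value of "stop"
])
actions = ["go_straight", "slow_down", "stop"]

def value_and_action(b):
    """V(b) = max_i alpha_i . b; the policy returns the maximizing action."""
    scores = alphas @ b        # dot product of each hyperplane with the belief
    i = int(np.argmax(scores))
    return scores[i], actions[i]

b = np.array([0.3, 0.7])       # belief over two intentions
v, a = value_and_action(b)
print(f"V(b) = {v:.2f}, action = {a}")  # V(b) = 6.20, action = slow_down
```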
Method
- In the offline stage, construct a motion model for each agent intention and compute the MOMDP policy; in the online stage, infer the intention over the finite set of candidates and act on the resulting belief.
- Modelling
- Each intention type corresponds to an agent policy $\rho \colon X \times Y \times \Theta \to A$, mapping the robot state, agent state, and intention to an agent action.
- $\rho$ can be computed by solving a simplified MDP, assuming the pedestrian follows the shortest path to its intended goal while avoiding collisions (see the sketch below).
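A minimal sketch of such a simplified MDP solved by value iteration on a grid; the layout, reward shaping, and names (`goal`, `blocked`, `rho`) are illustrative assumptions rather than the paper's exact model:

```python
import numpy as np

H, W = 6, 8
goal = (5, 7)                          # g_theta: the goal for this intention
blocked = {(2, 3), (3, 3), (4, 3)}     # obstacle cells the pedestrian avoids
moves = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
gamma, step_cost = 0.95, -1.0

def next_cell(c, d):
    r, q = c[0] + d[0], c[1] + d[1]
    if 0 <= r < H and 0 <= q < W and (r, q) not in blocked:
        return (r, q)
    return c                           # bumping a wall/obstacle leaves you in place

# Value iteration: the per-step cost makes shortest paths optimal.
V = np.zeros((H, W))
for _ in range(200):
    V_new = V.copy()
    for r in range(H):
        for q in range(W):
            if (r, q) == goal or (r, q) in blocked:
                continue
            V_new[r, q] = max(step_cost + gamma * V[next_cell((r, q), d)]
                              for d in moves.values())
    V = V_new

def rho(cell):
    """Greedy policy rho_theta: the move with the best one-step lookahead value."""
    return max(moves, key=lambda a: step_cost + gamma * V[next_cell(cell, a)])

print(rho((0, 0)))                     # heads toward the goal, e.g. 'down'
```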
- Execution
- The MOMDP policy is represented as a value function $V(x, y, b_\theta)$ over the observed robot state $x$, observed agent state $y$, and belief $b_\theta$ over intentions.
- First, select an action for the current belief: $V(x, y, b_\theta) = \max_{\alpha \in \Gamma(x, y)} \alpha \cdot b_\theta$, where $\Gamma(x, y)$ is the set of alpha-vectors at $(x, y)$ and each alpha-vector is tagged with an action.
- Then, update the belief over intentions; since the intention is assumed static, the sum over $\theta$ in the generic MOMDP update collapses to a per-$\theta$ product: $b'_\theta(\theta) = \eta\, Z(x', y', o)\, T_x(x, a, x')\, T_y(x, y, \theta, y')\, b_\theta(\theta)$, where $\eta$ is a normalizing constant. A sketch of this update follows.
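A minimal sketch of this update, assuming static intentions; the function names and the toy models for $T_x$, $T_y$, and $Z$ are illustrative stand-ins, not the paper's exact parameterization:

```python
import numpy as np

def update_belief(b, x, y, a, x_new, y_new, o, T_x, T_y, Z, thetas):
    """b'(theta) = eta * Z(x', y', o) * T_x(x, a, x') * T_y(x, y, theta, y') * b(theta)."""
    b_new = np.array([
        Z(x_new, y_new, o) * T_x(x, a, x_new) * T_y(x, y, th, y_new) * b[i]
        for i, th in enumerate(thetas)
    ])
    s = b_new.sum()
    return b_new / s if s > 0 else b  # eta: normalize; keep old belief if all-zero

# Toy usage (made-up numbers): the agent's observed move is more likely under
# "cross_street", so the belief shifts toward that intention.
thetas = ["cross_street", "stay_on_sidewalk"]
T_x = lambda x, a, x2: 1.0                                  # deterministic robot motion
T_y = lambda x, y, th, y2: 0.8 if th == "cross_street" else 0.2
Z = lambda x2, y2, o: 1.0                                   # noiseless observation model
b = update_belief(np.array([0.5, 0.5]), 0, 0, "fwd", 1, 1, None, T_x, T_y, Z, thetas)
print(b)  # -> [0.8 0.2]
```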