Human-Robot-Mutual

Formalizing Human-Robot Mutual Adaptation: A Bounded Memory Model

Idea: the robot reasons about how the human may change their strategy, based on a model of human adaptation.
Motivation:

Previous works do not use a model of human adaptation that lets the robot actively influence the human's actions.
Previous works do not reason over human adaptation throughout the interaction. Compare with the Intention-Aware Motion Planning paper.

Assumptions

Defining m_r and m_h requires a specific collaborative task (table rotation, as in the paper).

Preliminaries

f maps Q x A_r x A_h into modes M: {0,1}.
Q = X_state x H_k. At time step i, the state is ultimately represented by m_{h}^{i}, m_{r}^{i}, computed over a history of length k.

Method
Modelling (adaptable human): Human adaptability a is accounted for via the transition function P: Q to P(Q). At state q, the human chooses the action specified by m_r with probability a, or the action specified by m_h with probability 1-a.
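
The adaptation rule above can be sketched as a single sampling step. This is only an illustration, not the paper's implementation: the function name `sample_human_mode` and the integer mode labels are hypothetical, and `alpha` stands for the adaptability a.

```python
import random

def sample_human_mode(m_h, m_r, alpha, rng=random):
    """Return the mode the human follows at the next step.

    With probability alpha the human adopts the robot's mode m_r;
    otherwise the human keeps their own mode m_h.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return m_r if rng.random() < alpha else m_h

# Empirical check: with modes labelled 0 (human) and 1 (robot),
# the fraction of switches should approach alpha.
rng = random.Random(0)
samples = [sample_human_mode(0, 1, 0.3, rng) for _ in range(10_000)]
switch_rate = sum(samples) / len(samples)
```

With alpha = 0 the human never adapts, and with alpha = 1 the human always follows the robot, which matches the two extremes of the model.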
Planning (adaptable robot):

S: X x Y; X: X_world x M^{k} x M^{k}. M is the mode of the robot/human; Y is the partially observable variable, the adaptability.
The human policy $\pi_h$ comes from the BAM model. It outputs m_h or m_r.
The belief is over the unobservable variable a. Therefore, the MOMDP maps from V(Q, b(a)) to a_r. The BAM model then gives the human action a_h, and Eqn. 3 (below) is used to update the belief.
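
A belief update over the unobserved adaptability can be illustrated with a discrete Bayes filter. This is a sketch under stated assumptions, not a reproduction of the paper's belief-update equation: it assumes a stays constant during the interaction, takes values on a small hypothetical grid, and that the robot observes whether the human followed m_r (likelihood a) or m_h (likelihood 1-a). The names `update_belief` and `alpha_grid` are made up for illustration.

```python
def update_belief(belief, alpha_grid, human_followed_robot):
    """One Bayes step over a discretized adaptability value.

    belief[i] is the current probability that adaptability == alpha_grid[i].
    The observation likelihood is alpha if the human switched to the robot's
    mode, and 1 - alpha if the human kept their own mode.
    """
    likelihoods = [a if human_followed_robot else 1.0 - a for a in alpha_grid]
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the belief")
    return [u / total for u in unnormalized]

alpha_grid = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical discretization of a
belief = [0.2] * 5                         # uniform prior over the grid

# The human keeps their own mode twice: mass shifts toward low adaptability.
for _ in range(2):
    belief = update_belief(belief, alpha_grid, human_followed_robot=False)
```

After these two observations the fully adaptable hypothesis (a = 1) has zero probability, and the belief is ordered from the least to the most adaptable value on the grid.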