Planning for Autonomous Cars that Leverage Effects on Human Actions

Idea:

  1. Actions taken by the autonomous car affect how the human driver responds, and the planner can leverage this effect.
  2. Approximate the human as an optimal driver whose reward function is acquired through inverse reinforcement learning.

Motivation:

  1. Current autonomous cars are defensive.
  2. The goal is to plan more efficient and communicative behaviors for autonomous cars.

Assumptions:

  1. Two-car system: one autonomous car (the robot) and one human-driven car.

Method:

  1. At every iteration the robot runs model predictive control (MPC): it computes a finite sequence of actions that maximizes its reward (Eqn 5) and then executes only the first one.

    [Eqn 5: equation image not recovered]
  2. Compute the human's optimal response u^{*}_H by optimizing the following:

    [equation image not recovered]
  3. Learn the human reward r_H with inverse reinforcement learning, a separate optimization process that maximizes the probability of the demonstrations.
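
A minimal sketch of how steps 1 and 2 fit together, under toy assumptions that are not from the paper: a 1-D road, speeds as actions, and quadratic rewards chosen so that the human's best response has a closed form. Every constant and function name here (`human_best_response`, `robot_reward`, `plan`, `run_mpc`) is hypothetical, and plain finite-difference gradient ascent stands in for L-BFGS to keep the example dependency-free.

```python
# Toy two-car model on a 1-D road. All constants are illustrative assumptions.
HORIZON = 3      # number of future actions in each plan
V_HUMAN = 1.0    # speed the human prefers when left alone
COUPLING = 1.0   # how strongly the human matches the robot's speed
GOAL = 4.0       # position the robot wants to reach
V_SLOW = 0.5     # human speed the robot would like to induce
INFLUENCE = 2.0  # weight of the "shape the human's speed" term


def human_best_response(u_robot):
    """u_H*: per-step maximizer of the toy human reward
    -(u_h - V_HUMAN)^2 - COUPLING * (u_h - u_r)^2 (closed form)."""
    return [(V_HUMAN + COUPLING * u) / (1.0 + COUPLING) for u in u_robot]


def robot_reward(p_robot, u_robot):
    """Robot reward with the human's best response plugged in."""
    u_human = human_best_response(u_robot)
    reward, p = 0.0, p_robot
    for u_r, u_h in zip(u_robot, u_human):
        p += u_r                                   # unit time step
        reward -= (p - GOAL) ** 2                  # progress toward the goal
        reward -= INFLUENCE * (u_h - V_SLOW) ** 2  # effect on the human
    return reward


def plan(p_robot, iters=500, lr=0.05, eps=1e-5):
    """Maximize the robot reward over a HORIZON-step action sequence using
    central finite differences and gradient ascent (stand-in for L-BFGS)."""
    u = [0.0] * HORIZON
    for _ in range(iters):
        grad = []
        for i in range(HORIZON):
            up = u[:i] + [u[i] + eps] + u[i + 1:]
            down = u[:i] + [u[i] - eps] + u[i + 1:]
            diff = robot_reward(p_robot, up) - robot_reward(p_robot, down)
            grad.append(diff / (2 * eps))
        u = [ui + lr * gi for ui, gi in zip(u, grad)]
    return u


def run_mpc(steps=5):
    """Receding-horizon loop: plan a sequence, execute only its first action."""
    p_robot, p_human, history = 0.0, 0.0, []
    for _ in range(steps):
        u_r = plan(p_robot)[0]               # keep only the first action
        u_h = human_best_response([u_r])[0]  # human reacts to that action
        p_robot += u_r
        p_human += u_h
        history.append((u_r, u_h, p_robot, p_human))
    return history
```

Calling `run_mpc()` returns one `(robot action, human action, robot position, human position)` tuple per step. The rest of each planned sequence is discarded and the plan is recomputed from the new state, which is the MPC pattern described in step 1.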

Implementation:

Use Theano to compute the Jacobian and Hessian symbolically, and use L-BFGS to optimize Eqn 5 (code: https://github.com/dsadigh/driving-interactions/blob/master/utils.py).
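
A rough sketch of this optimization pattern, not the linked code: hand-written derivatives replace Theano's symbolic ones, the quadratic rewards are toy assumptions, and all names and constants are hypothetical. One standard way to use the Hessian is implicit differentiation of the human's optimality condition, which gives du_H*/du_R = -(d²R_H/du_H²)^{-1} (d²R_H/du_H du_R). Whether the linked code does exactly this is not stated in these notes. The resulting total gradient is passed to SciPy's L-BFGS-B.

```python
import numpy as np
from scipy.optimize import minimize

# Toy constants (assumptions for illustration only).
V_H, C = 1.0, 1.0                # human: preferred speed, coupling to robot
V_R, V_SLOW, W = 1.0, 0.5, 2.0   # robot: preferred speed, target human speed, weight


def human_reward(u_r, u_h):
    return -np.sum((u_h - V_H) ** 2) - C * np.sum((u_h - u_r) ** 2)


def human_grad(u_r, u_h):
    """dR_H/du_H for the toy human reward."""
    return -2.0 * (u_h - V_H) - 2.0 * C * (u_h - u_r)


def human_response(u_r):
    """Inner problem: u_H* = argmax over u_H of R_H(u_R, u_H)."""
    res = minimize(lambda u: -human_reward(u_r, u), np.zeros_like(u_r),
                   jac=lambda u: -human_grad(u_r, u), method="L-BFGS-B")
    return res.x


def robot_objective(u_r):
    """Return (-R_R, -dR_R/du_R), differentiating through u_H*(u_R)."""
    u_h = human_response(u_r)
    n = u_r.size
    h_hh = -2.0 * (1.0 + C) * np.eye(n)       # d^2 R_H / du_H^2
    h_hr = 2.0 * C * np.eye(n)                # d^2 R_H / du_H du_R
    duh_dur = -np.linalg.solve(h_hh, h_hr)    # implicit differentiation
    reward = -np.sum((u_r - V_R) ** 2) - W * np.sum((u_h - V_SLOW) ** 2)
    d_ur = -2.0 * (u_r - V_R)                 # direct effect of u_R
    d_uh = -2.0 * W * (u_h - V_SLOW)          # effect through the human
    grad = d_ur + duh_dur.T @ d_uh            # total derivative
    return -reward, -grad                     # minimize the negative reward


# Outer problem: jac=True tells SciPy the objective also returns its gradient.
result = minimize(robot_objective, np.zeros(3), jac=True, method="L-BFGS-B")
u_robot = result.x
```

In this toy the optimizer settles below the robot's preferred speed `V_R`, because driving slower also pulls the human's speed toward `V_SLOW`. A planner that ignored the human's reaction would simply pick `V_R`.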